5.6 Review of Model Results from Related Research

In this section we also compare the models published in the literature to date. Only a limited number of documented approaches use deep learning models for micro facial expression recognition, and there is no standard methodology for conducting a micro-expression recognition experiment. As a consequence, results are difficult to compare directly: different approaches use the datasets differently and merge the emotion categories into different classification problems.

One of the first attempts to model facial micro-expressions with deep learning was made by Kim et al. [18]. They used a convolutional neural network to extract high-level features from the first, the most expressive (apex), and the last frame of each CASME II sample sequence, and then processed these features with an LSTM recurrent network. They achieved an accuracy of 0.610.
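The key-frame selection in such a pipeline can be illustrated with a simple heuristic: the apex frame is taken to be the frame that differs most from the onset frame. The sketch below is an illustrative approximation, not the exact procedure of Kim et al. [18]:

```python
import numpy as np

def select_key_frames(sequence):
    """Return the onset (first), apex (most expressive) and offset
    (last) frame of a micro-expression clip, plus the apex index.

    `sequence` has shape (T, H, W); the apex is approximated as the
    frame with the largest mean absolute difference from the onset
    frame -- a simple heuristic, not the authors' exact method.
    """
    onset = sequence[0]
    diffs = np.abs(sequence - onset).mean(axis=(1, 2))
    apex_idx = int(diffs.argmax())
    return sequence[0], sequence[apex_idx], sequence[-1], apex_idx

# Toy clip: 10 frames of 4x4 "pixels"; the expression peaks at frame 6.
clip = np.zeros((10, 4, 4))
clip[6] = 1.0
_, apex, _, idx = select_key_frames(clip)
```

In the full pipeline each of the three selected frames would be passed through the CNN and the resulting feature vectors fed to the LSTM in temporal order.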

Peng et al. [1] combined optical flow computed on neighbouring frames with a three-dimensional convolutional network that has four convolutional and four pooling layers and uses a separate stream for the samples of each database; the two streams are then merged.

The reason for this architectural choice is that the databases were recorded at different frame rates, so the authors did not want to use the same receptive fields in the temporal dimension. In addition, they resampled the data beforehand and artificially augmented it; we used the same idea, described in Section 4.6. With the model trained and tested on the combined samples of the CASME I and CASME II databases they achieved a recognition accuracy of 0.667; no F1 score is reported.
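The frame-rate argument can be made concrete: a temporal convolution that should cover a fixed time window must span a different number of frames in each database. The sketch below assumes 60 fps and 200 fps as illustrative frame rates for the two streams; the numbers are given only to show the effect, not taken from [1]:

```python
def temporal_kernel_size(window_ms, fps):
    """Number of frames a temporal convolution must span to cover a
    fixed time window at a given frame rate (ceiling, at least 1)."""
    frames = -(-window_ms * fps // 1000)  # ceiling division
    return max(1, int(frames))

# The same 50 ms window corresponds to very different temporal
# receptive fields at the two (assumed) frame rates:
k_low = temporal_kernel_size(50, 60)    # low-speed stream
k_high = temporal_kernel_size(50, 200)  # high-speed stream
```

A single shared temporal kernel size would therefore treat the two databases inconsistently, which is why [1] keeps a separate stream per database before merging.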

The paper Enriched Long-term Recurrent Convolutional Network for Facial Micro-Expression Recognition [21] uses transfer learning from the VGG-16 Deep Face model [53] and an LSTM recurrent network, reaching an accuracy of 0.524. The authors did not use any resampling and trained the model on the unbalanced dataset. They also ran an experiment with training on CASME II and testing on SAMM, reaching an accuracy of 0.434, and an experiment with joint training on the combined datasets, reaching an accuracy of 0.570.
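Training on an unbalanced set, as in [21], tends to bias a model toward the majority emotions. A common mitigation, shown here only as a generic sketch and not as part of [21], is inverse-frequency class weighting of the loss:

```python
from collections import Counter

def class_weights(labels):
    """Inverse-frequency weights: rare classes receive proportionally
    larger weights, so the loss does not favour the majority class."""
    counts = Counter(labels)
    total = len(labels)
    return {c: total / (len(counts) * n) for c, n in counts.items()}

# Hypothetical label distribution: 8 "happy" vs. 2 "surprise" samples.
weights = class_weights(["happy"] * 8 + ["surprise"] * 2)
# the minority class gets a 4x larger weight than the majority class
```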

Perhaps the most interesting approach to micro-expression recognition is offered by From Macro to Micro Expression Recognition: Deep Learning on Small Datasets Using Transfer Learning [22]. The authors first trained a model for macro-expression recognition, for which considerably larger datasets exist, and then fine-tuned it on the micro-expression datasets that we also used. With this approach they achieved high accuracy both on the CASME II dataset (accuracy 0.757, F1 0.650) and on the SAMM dataset (accuracy 0.706, F1 0.540).
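The macro-to-micro transfer step can be sketched abstractly: layers pretrained on macro-expression data are reused, the early feature layers are frozen, and only the later layers are re-fitted on the small micro-expression sets. The layer names below are purely hypothetical; this is a framework-agnostic illustration, not the implementation of [22]:

```python
def finetune_split(layers, n_frozen):
    """Split a pretrained network's layers into a frozen part (reused
    macro-expression features) and a trainable part that is re-fitted
    on the small micro-expression datasets."""
    return layers[:n_frozen], layers[n_frozen:]

# Hypothetical layer names, for illustration only:
layers = ["conv1", "conv2", "conv3", "fc1", "fc_micro"]
frozen, trainable = finetune_split(layers, n_frozen=3)
```

Freezing the early layers keeps the generic facial features learned from the larger macro-expression corpora, while the small micro-expression sets only have to fit the final classification layers.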

Chapter 6

Conclusion

In this master's thesis we investigated the algorithmic recognition of micro facial expressions using deep learning methods, i.e. deep neural networks. We tested several different approaches, modelled micro facial expressions with a variety of convolutional and LSTM neural network architectures and, despite the limited datasets, achieved results comparable to most related research to date. We conclude that deep learning is suitable for this task, but richer training datasets would further improve the accuracy of automatic micro facial expression recognition.

Micro facial expression recognition technology is not yet mature enough to be used in practical everyday applications. It nevertheless seems that, with sufficient support from the research community, it could within a few years advance to a point where its accuracy is good enough for some applications. Imagine, for example, that for advertising purposes the front camera of our phone continuously tracks the micro-reactions of our face to the ads being shown, recording in detail which ones we dislike and which ones we like a little more. In such a setting, micro facial expression recognition technology represents, in our opinion, a dangerous intrusion into an individual's privacy, and it therefore deserves the attention of the broader public and of legal authorities.

References

[1] M. Peng, C. Wang, T. Chen, G. Liu, and X. Fu, “Dual temporal scale convolutional neural network for micro-expression recognition,” Frontiers in Psychology, vol. 8, p. 1745, 2017.

[2] L. Josephs, “Book reviews: Emotions revealed: Recognizing faces and feelings to improve communication and emotional life, by Paul Ekman. Henry Holt and Company, New York, 2004, 274 pp,” The American Journal of Psychoanalysis, vol. 65, pp. 409–411, 12 2005.

[3] B. Fasel and J. Luettin, “Automatic facial expression analysis: A survey,” 01 1999.

[4] L. Zhang and D. Tjondronegoro, “Facial expression recognition using facial movement features,” IEEE Trans. Affect. Comput., vol. 2, pp. 219–229, Oct. 2011.

[5] X. Li, T. Pfister, X. Huang, G. Zhao, and M. Pietikainen, “A spontaneous micro-expression database: Inducement, collection and baseline,” in 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), vol. 00, pp. 1–6, 04 2013.

[6] Y. Li, S. Wang, Y. Zhao, and Q. Ji, “Simultaneous facial feature tracking and facial expression recognition,” IEEE Transactions on Image Processing, vol. 22, pp. 2559–2573, 2013.

[7] S. Porter and L. ten Brinke, “Reading between the lies,” Psychological science, vol. 19, pp. 508–14, 06 2008.


[8] P. Ekman, “Lie catching and microexpressions,” in The Philosophy of Deception (C. W. Martin, ed.), pp. 118–133, Oxford University Press, 2009.

[9] E. A. Haggard and K. S. Isaacs, Micromomentary facial expressions as indicators of ego mechanisms in psychotherapy, pp. 154–165. Boston, MA: Springer US, 1966.

[10] P. Ekman and W. V. Friesen, “Nonverbal leakage and clues to deception,” Psychiatry, vol. 32, no. 1, pp. 88–106, 1969. PMID: 27785970.

[11] M. G. Frank, M. Herbasz, K. Sinuk, A. Keller, and C. Nolan, “I see how you feel: Training laypeople and professionals to recognize fleeting emotions,” The Annual Meeting of the International Communication Association, 2009.

[12] W. Merghani, A. K. Davison, and M. H. Yap, “A review on facial micro-expressions analysis: Datasets, features and metrics,” CoRR, vol. abs/1805.02397, 2018.

[13] S. Polikovsky, Y. Kameda, and Y. Ohta, “Facial micro-expression detection in hi-speed video based on facial action coding system (facs),” IEICE Transactions on Information and Systems, vol. E96.D, no. 1, pp. 81–92, 2013.

[14] T. Pfister, X. Li, G. Zhao, and M. Pietikäinen, “Differentiating spontaneous from posed facial expressions within a generic facial expression recognition framework,” in 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), pp. 868–875, Nov 2011.

[15] Y. Liu, J. Zhang, W. Yan, S. Wang, G. Zhao, and X. Fu, “A main directional mean optical flow feature for spontaneous micro-expression recognition,” IEEE Transactions on Affective Computing, vol. 7, pp. 299–310, Oct 2016.

[16] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei, “Large-scale video classification with convolutional neural networks,” in 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732, June 2014.

[17] A. Kamel, B. Sheng, P. Yang, P. Li, R. Shen, and D. D. Feng, “Deep convolutional neural networks for human action recognition using depth maps and postures,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, pp. 1–14, 2018.

[18] D. H. Kim, W. J. Baddar, and Y. M. Ro, “Micro-expression recognition with expression-state constrained spatio-temporal feature representations,” in Proceedings of the 24th ACM International Conference on Multimedia, MM ’16, (New York, NY, USA), pp. 382–386, ACM, 2016.

[19] X.-l. Hao and M. Tian, “Deep belief network based on double weber local descriptor in micro-expression recognition,” pp. 419–425, 05 2017.

[20] F. Cheng, J. Yu, and H. Xiong, “Facial expression recognition in JAFFE dataset based on gaussian process classification,” Trans. Neur. Netw., vol. 21, pp. 1685–1690, Oct. 2010.

[21] H.-Q. Khor, J. See, R. C. Phan, and W. Lin, “Enriched long-term recurrent convolutional network for facial micro-expression recognition,” arXiv preprint arXiv:1805.08417, 2018.

[22] M. Peng, W. Zhan, Z. Zhang, and T. Chen, “From macro to micro expression recognition: Deep learning on small datasets using transfer learning,” pp. 657–661, 05 2018.

[23] S. Polikovsky, Y. Kameda, and Y. Ohta, “Facial micro-expressions recognition using high speed camera and 3d-gradient descriptor,” in 3rd International Conference on Imaging for Crime Detection and Prevention (ICDP 2009), pp. 1–6, Dec 2009.

[24] W.-J. Yan, Q. Wu, Y.-J. Liu, S.-J. Wang, and X. Fu, “CASME database: A dataset of spontaneous micro-expressions collected from neutralized faces,” 04 2013.

[25] X. Li, T. Pfister, X. Huang, G. Zhao, and M. Pietikäinen, “A spontaneous micro-expression database: Inducement, collection and baseline,” in 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), pp. 1–6, April 2013.


[26] W.-J. Yan, X. Li, S.-J. Wang, G. Zhao, Y.-J. Liu, Y.-H. Chen, and X. Fu, “CASME II: An improved spontaneous micro-expression database and the baseline evaluation,” PLOS ONE, vol. 9, pp. 1–8, 01 2014.

[27] A. K. Davison, C. Lansley, N. Costen, K. Tan, and M. Yap, “SAMM: A spontaneous micro-facial movement dataset,” IEEE Transactions on Affective Computing, vol. 9, pp. 116–129, Jan. 2018.

[28] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016. http://www.deeplearningbook.org.

[29] T. M. Mitchell, Machine Learning. McGraw-Hill, 1997.

[30] “Activation functions: Neural networks.” https://towardsdatascience.com/activation-functions-neural-networks-1cbd9f8d91d6. Accessed: 2019-01-17.

[31] V. Nair and G. E. Hinton, “Rectified linear units improve restricted boltzmann machines,” pp. 807–814, 2010.

[32] D. R. Wilson and T. R. Martinez, “The general inefficiency of batch training for gradient descent learning,” Neural Netw., vol. 16, pp. 1429–1451, Dec. 2003.

[33] “Gradient descent algorithm and its variants.” https://towardsdatascience.com/gradient-descent-algorithm-and-its-variants-10f652806a3. Accessed: 2019-01-17.

[34] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” CoRR, vol. abs/1412.6980, 2014.

[35] M. D. Zeiler, “Adadelta: An adaptive learning rate method,” 2012.

[36] A. Graves, “Generating sequences with recurrent neural networks,” 2013.

[37] “Willamette university cs449.” http://www.willamette.edu/~gorr/classes/cs449/momrate.html. Accessed: 2018-08-12.

[38] F. Altenberger and C. Lenz, “A non-technical survey on deep convolutional neural network architectures,” CoRR, vol. abs/1803.02129, 2018.


[39] “Understanding lstm networks.” http://colah.github.io/posts/2015-08-Understanding-LSTMs/. Accessed: 2019-01-17.

[40] Y. Bengio, P. Simard, and P. Frasconi, “Learning long-term dependencies with gradient descent is difficult,” Trans. Neur. Netw., vol. 5, pp. 157–166, Mar. 1994.

[41] R. Pascanu, T. Mikolov, and Y. Bengio, “Understanding the exploding gradient problem,” CoRR, vol. abs/1211.5063, 2012.

[42] K. Greff, R. K. Srivastava, J. Koutník, B. R. Steunebrink, and J. Schmidhuber, “LSTM: A search space odyssey,” CoRR, vol. abs/1503.04069, 2015.

[43] D. E. King, “Dlib-ml: A machine learning toolkit,” J. Mach. Learn. Res., vol. 10, pp. 1755–1758, Dec. 2009.

[44] “Face++ cognitive services.” www.faceplusplus.com. Accessed: 2018-08-15.

[45] S.-T. Liong, J. See, K. Wong, and R. C.-W. Phan, “Less is more: Micro-expression recognition from video using apex frame,” Signal Processing: Image Communication, vol. 62, pp. 82–92, 2018.

[46] K. Simonyan and A. Zisserman, “Two-stream convolutional networks for action recognition in videos,” CoRR, vol. abs/1406.2199, 2014.

[47] D. Tran, L. D. Bourdev, R. Fergus, L. Torresani, and M. Paluri, “C3D: generic features for video analysis,” CoRR, vol. abs/1412.0767, 2014.

[48] Q. Li, J. Yu, T. Kurihara, and S. Zhan, “Micro-expression analysis by fusing deep convolutional neural network and optical flow,” pp. 265–270, 04 2018.

[49] U. Gargi, R. Kasturi, and S. H. Strayer, “Performance characterization of video-shot-change detection methods,” IEEE Trans. Cir. and Sys. for Video Technol., vol. 10, pp. 1–13, Feb. 2000.

[50] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, “TensorFlow: Large-scale machine learning on heterogeneous systems,” 2015. Software available from tensorflow.org.

[51] F. Chollet et al., “Keras.” https://keras.io, 2015.

[52] T. Oliphant, “NumPy: A guide to NumPy.” USA: Trelgol Publishing, 2006–. [Online; accessed <today>].

[53] O. M. Parkhi, A. Vedaldi, and A. Zisserman, “Deep face recognition,” in British Machine Vision Conference, 2015.