I am a Ph.D. candidate and research and teaching assistant (RA/TA) in the Department of Electrical and Computer Engineering, College of Engineering, University of Miami.

My research interests include deep learning, bioinformatics, artificial intelligence, and engineering management. I have published papers in conferences and journals, with 87 total Google Scholar citations.

🔥 News

Briefings in Bioinformatics

Deep Contrastive Learning for Predicting Cancer Prognosis Using Gene Expression Values

Anchen Sun, Elizabeth J. Franzmann, Zhibin Chen, Xiaodong Cai

Abstract Recent advancements in image classification have demonstrated that contrastive learning (CL) can aid in further learning tasks by acquiring good feature representation from a limited number of data samples. In this paper, we applied CL to tumor transcriptomes and clinical data to learn feature representations in a low-dimensional space. We then utilized these learned features to train a classifier to categorize tumors into a high- or low-risk group of recurrence. Using data from The Cancer Genome Atlas (TCGA), we demonstrated that CL can significantly improve classification accuracy. Specifically, our CL-based classifiers achieved an area under the receiver operating characteristic curve (AUC) greater than 0.8 for 14 types of cancer, and an AUC greater than 0.9 for 3 types of cancer. We also developed CL-based Cox (CLCox) models for predicting cancer prognosis. Our CLCox models trained with the TCGA data outperformed existing methods significantly in predicting the prognosis of 19 types of cancer under consideration. The performance of CLCox models and CL-based classifiers trained with TCGA lung and prostate cancer data were validated using the data from two independent cohorts. We also show that the CLCox model trained with the whole transcriptome significantly outperforms the Cox model trained with the 16 genes of Oncotype DX that is in clinical use for breast cancer patients. The trained models and the Python codes are publicly accessible and provide a valuable resource that will potentially find clinical applications for many types of cancer.

IEEE ICDL 2024 Full Oral Presentation

Who Said What? An Automated Approach to Analyzing Speech in Preschool Classrooms

Anchen Sun, Juan J Londono, Batya Elbaum, Luis Estrada, Roberto Jose Lazo, Laura Vitale, Hugo Gonzalez Villasanti, Riccardo Fusaroli, Lynn K Perry, Daniel S Messinger

Abstract Young children spend substantial portions of their waking hours in noisy preschool classrooms. In these environments, children’s vocal interactions with teachers are critical contributors to their language outcomes, but manually transcribing these interactions is prohibitive. Using audio from child- and teacher-worn recorders, we propose an automated framework that uses open source software both to classify speakers (ALICE) and to transcribe their utterances (Whisper). We compare results from our framework to those from a human expert for 110 minutes of classroom recordings, including 85 minutes from child-worn microphones (n=4 children) and 25 minutes from teacher-worn microphones (n=2 teachers). The overall proportion of agreement, that is, the proportion of correctly classified teacher and child utterances, was .76, with an error-corrected kappa of .50 and a weighted F1 of .76. The word error rate for both teacher and child transcriptions was .15, meaning that 15% of words would need to be deleted, added, or changed to equate the Whisper and expert transcriptions. Moreover, speech features such as the mean length of utterances in words, the proportion of teacher and child utterances that were questions, and the proportion of utterances that were responded to within 2.5 seconds were similar when calculated separately from expert and automated transcriptions. The results suggest substantial progress in analyzing classroom speech that may support children’s language development. Future research using natural language processing is under way to improve speaker classification and to analyze results from the application of the automated framework to a larger dataset containing classroom recordings from 13 children and 3 teachers observed on 17 occasions over one year.

Journal and Conference Full Paper

Symposium, Conference Poster, and Presentation

🎖 Honors and Awards

📖 Education

  • 2020.08 - Present, University of Miami, Doctor of Philosophy (Ph.D.) in Electrical and Computer Engineering (Advisor: Dr. Xiaodong Cai)
  • 2019.01 - 2020.05, University of Miami, Master of Science (M.S.) with thesis option in Electrical and Computer Engineering
  • 2014.08 - 2017.12, University of Miami, Bachelor of Science (B.S.) in Marine Science and Computer Science with Minor in Mathematics

💬 Invited Talks

  • 2023.03, AI module, ECE114 Global Challenges addressed by Engineering and Tech.

💻 Internships

  • 2018.04 - 2018.08, Statistical Programmer, Biorasi LLC, FL, U.S.