Chung-Yu Wang 💪

I am currently a graduate student in Computer Science (M.Sc.) at York University (YorkU) in Canada, supervised by Professor Hung Viet Pham. My research interests include, but are not limited to, Software Engineering (AI4SE), Natural Language Processing, and Machine Learning. More specifically, I focus on optimizing large language models (LLMs) and foundation models (FMs) for software engineering tasks, such as prompt engineering for code generation.

Resume

Education

  1. MSc in Computer Science

    York University, Canada
  2. BSc in Information Management

    National University of Kaohsiung, Taiwan
Featured Projects

Check out my featured projects below!

Publications
Selection of Prompt Engineering Techniques for Code Generation through Predicting Code Complexity
arXiv (under review) ∙ September 2024
PET-Select is a PET-agnostic model that improves code-generation accuracy by selecting the most appropriate prompt engineering technique (PET) for each query based on its predicted code complexity. It uses contrastive learning to separate simple from complex queries, enabling more effective PET selection. In our evaluations, PET-Select improves pass@1 accuracy by up to 1.9% while reducing token usage by 74.8% across benchmarks.
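To give a flavor of the routing idea, here is a minimal sketch (not the paper's implementation): embed the query, score its predicted complexity against simple/complex prototypes in a contrastively trained space, and route it to a cheap or an elaborate PET accordingly. `embed`, the prototype vectors, and the zero threshold are hypothetical stand-ins.

```python
# Sketch of complexity-based PET selection (hypothetical, for illustration).
import numpy as np

def embed(query: str) -> np.ndarray:
    """Hypothetical placeholder for a contrastively trained encoder."""
    rng = np.random.default_rng(abs(hash(query)) % (2**32))
    return rng.standard_normal(16)

def predicted_complexity(query: str, simple_proto: np.ndarray,
                         complex_proto: np.ndarray) -> float:
    """Score a query by its similarity to a 'complex' vs a 'simple'
    prototype, mimicking how a contrastive space separates the classes."""
    q = embed(query)
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return cos(q, complex_proto) - cos(q, simple_proto)

def select_pet(query: str, simple_proto: np.ndarray,
               complex_proto: np.ndarray) -> str:
    """Route simple queries to zero-shot prompting (fewer tokens) and
    complex ones to a heavier technique such as chain-of-thought."""
    score = predicted_complexity(query, simple_proto, complex_proto)
    return "chain-of-thought" if score > 0.0 else "zero-shot"
```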
Deep-Bench: Deep Learning Benchmark Dataset for Code Generation
arXiv (under review) ∙ Present
Deep-Bench is a new benchmark for function-level deep learning (DL) code generation, designed to cover the full DL pipeline across phases, tasks, and data types, unlike prior benchmarks such as DS-1000 that focus narrowly on pre- and post-processing. Leading LLMs such as GPT-4o achieve significantly lower accuracy on Deep-Bench (31% vs. 60% on DS-1000), highlighting its greater complexity. Our analysis reveals substantial performance variation across categories and common bugs in LLM-generated DL code, offering valuable insights into current limitations and future improvements.
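The accuracy figures above are pass@1 scores. As a quick illustration of that metric (a generic sketch, not Deep-Bench's actual harness; the `Task` fields and `generate` callable are hypothetical), scoring reduces to generating one completion per task and running the task's hidden tests:

```python
# Sketch of single-sample pass@1 scoring for a function-level benchmark.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Task:
    prompt: str                    # natural-language spec of the DL function
    check: Callable[[str], bool]   # runs hidden unit tests on generated code

def pass_at_1(tasks: List[Task], generate: Callable[[str], str]) -> float:
    """pass@1 with one sample per task: the fraction of tasks whose
    single generated completion passes all hidden tests."""
    passed = sum(1 for t in tasks if t.check(generate(t.prompt)))
    return passed / len(tasks)
```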
Task-oriented Prompt Enhancement via Script Generation
31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining Workshop (KDD'25 Workshop) ∙ April 2024
TITAN is a novel strategy that enhances large language models' (LLMs') performance on task-oriented prompts through a universal, zero-shot approach. It eliminates the need for task-specific instructions and manual effort by leveraging step-back and chain-of-thought prompting to refine the code-generation process. In our evaluations, TITAN outperforms existing zero-shot methods, achieving state-of-the-art performance on 8 of 11 tasks and offering a significant improvement in handling everyday task-oriented prompts.
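A minimal sketch of the two-stage prompting flow this summary describes, assuming a generic chat-completion client (`call_llm` is a hypothetical stand-in, and the prompt wording is illustrative rather than TITAN's actual templates):

```python
# Sketch of step-back + chain-of-thought prompt composition.

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for any chat-completion client."""
    raise NotImplementedError("plug in your LLM client here")

def step_back_then_cot(task: str) -> str:
    # Step-back: ask for the general procedure behind the concrete task.
    outline = call_llm(
        f"What general steps does the following task involve?\n{task}"
    )
    # Chain-of-thought: reason through those steps before emitting code.
    return call_llm(
        "Think step by step, following the outline below, then write a "
        f"Python script that performs the task.\n\nTask: {task}\n\n"
        f"Outline:\n{outline}"
    )
```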
Can ChatGPT Support Developers? An Empirical Evaluation of Large Language Models for Code Generation
21st International Conference on Mining Software Repositories (MSR’24) ∙ March 2024
Large language models (LLMs) have shown promise in code generation, but existing studies focus mainly on research settings, leaving gaps in understanding their real-world utility. An empirical analysis of developer conversations from the DevGPT dataset reveals that LLM-generated code is primarily used for demonstrating concepts or examples rather than as production-ready code. These findings highlight the need for further improvements before LLMs can play a significant role in modern software development.
High-efficiency classification of injured causes on agricultural jujubes using EfficientNet
26th International Conference on Technologies and Applications of Artificial Intelligence (TAAI) ∙ November 2021