Abstract: ChatGPT and DeepSeek are two prominent large language models with different strengths. ChatGPT is widely known for its conversational abilities and broad language understanding, while DeepSeek is recognized for its performance in computational and technical domains. This paper presents a comparative evaluation of DeepSeek-R1 and ChatGPT across several mathematical and algorithmic benchmarks. Both models perform strongly and competitively, and which one leads depends on the benchmark: DeepSeek-R1 shows a slight advantage in advanced mathematical problem-solving, while ChatGPT excels in competitive programming and complex quantitative reasoning tasks. Although overall performance is closely matched, notable differences emerge in specific areas, so the model should be chosen to fit the requirements of the task. These findings offer guidance for researchers and practitioners deploying large language models in mathematical and computational domains.

Keywords: ChatGPT, DeepSeek-R1, Large Language Models, Mathematical Reasoning, Benchmark Comparison, AI Performance Evaluation


DOI: 10.17148/IARJSET.2025.125378
