2023.10.12
Students from Professor Hyo-Pil Shin’s Linguistics and Computational Linguistics/Natural Language Processing Lab at Seoul National University have developed and released the DaG LLM (David and Goliath Large Language Model), trained on two POSEIDON Ultimate 4000U servers (each equipped with four H100 GPUs) and the BARO Cluster, provided by BARO AI.

* What is an LLM (Large Language Model)? An artificial intelligence model trained to understand and generate human language, with ChatGPT being a prime example.
Currently, the LLM landscape is divided between closed models developed by major corporations such as OpenAI, Google, and Naver (e.g., PaLM and CLOVA X) and open-source models developed by organizations such as Meta and EleutherAI (e.g., LLaMA and Polyglot-Ko). Building an LLM requires massive amounts of data and computing power, making it increasingly difficult for universities, research institutes, and small enterprises with limited resources to develop their own models.
DaG LLM was created by fine-tuning the open Korean language model Polyglot-Ko-5.8B, combined with appropriate prompting techniques. It represents an attempt to rival the performance of closed LLMs from large tech companies. The model is tailored to specialized domains such as law, finance, and healthcare, and is expected to serve as a foundational step in advancing Korean-language LLM development.
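As a rough sketch of the kind of workflow involved (the lab's actual training code, data, and prompt format are not published here), the base model can be loaded and queried through the Hugging Face Transformers library. The model ID below is the public EleutherAI checkpoint that DaG LLM fine-tunes; the instruction-style prompt is purely illustrative:

```python
# Minimal sketch: load the public Polyglot-Ko-5.8B checkpoint and generate text.
# The legal-domain prompt template below is illustrative, not DaG LLM's actual format.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/polyglot-ko-5.8b"  # public base model fine-tuned for DaG LLM
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so the 5.8B model fits on one GPU
    device_map="auto",          # place layers on whatever devices are available
)

prompt = "### 질문: 임대차 계약 해지 요건을 설명해 주세요.\n### 답변:"  # hypothetical prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```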

The POSEIDON Ultimate Line-up used in this development is equipped with H100 GPUs, delivering top-tier performance for large-scale, complex deep learning, machine learning, high-performance data analysis, and large language model workloads that require extended training time. MIG (Multi-Instance GPU) virtualization partitions each H100 into isolated instances, so multiple workloads can run concurrently while keeping resource utilization at its maximum. As the first server in Korea to feature four H100 GPUs, it enables up to 9x performance acceleration in natural language processing research.
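For context on how a MIG partition is consumed from training code, here is a hedged sketch: an administrator first creates the instances with nvidia-smi, and a process then pins itself to one slice via the CUDA_VISIBLE_DEVICES variable. The MIG UUID below is a placeholder, not a real identifier:

```python
# Sketch: pin a PyTorch process to a single MIG instance of an H100.
# The UUID is a placeholder; list real ones with `nvidia-smi -L` after an
# administrator has partitioned the GPU.
import os

os.environ["CUDA_VISIBLE_DEVICES"] = "MIG-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

import torch  # import after setting the variable so the runtime sees only this slice

assert torch.cuda.is_available()
print(torch.cuda.get_device_name(0))  # the MIG slice appears as device 0
x = torch.randn(1024, 1024, device="cuda")
print((x @ x).sum().item())           # tiny compute check on the isolated slice
```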
Try DaG LLM | Learn more: POSEIDON Ultimate Line-up