CANSSI Prairies Workshop: From Classical NLP to Large Language Models: Concepts, Architectures, and Practical Demonstrations

Date: Saturday, February 7, 2026
Time: 9:00–16:00 Central Time
Place: University of Manitoba, Fort Garry Campus, Armes Building, Room 201
Workshop Description
This one-day workshop, titled “From Classical NLP to Large Language Models: Concepts, Architectures, and Practical Demonstrations,” is the fifth in the CANSSI Prairies Workshop Series in Data Science. It will be led by Lei Ding, Assistant Professor of Statistics at the University of Manitoba.
The workshop is intended for students, researchers, and professionals in statistics, computer science, and data science, as well as for individuals interested in understanding or applying Natural Language Processing (NLP) and Large Language Models (LLMs) in research or practice. No deep background in machine learning is required, although basic programming familiarity would be helpful.
Participants will gain a unified understanding of classical and modern NLP; insight into how LLMs learn, reason, and behave; practical code examples for embeddings and Retrieval-Augmented Generation (RAG); and a strong foundation for research or applied work involving LLMs.
By the end of the workshop, participants will:
- Understand classical NLP representations and why they fail to capture semantics
- Grasp the key innovations behind word embeddings
- Learn the Transformer architecture and why it became the dominant model
- Understand how LLMs are pretrained, instruction-tuned, and aligned with human feedback
- See how models perform reasoning and why Chain-of-Thought (CoT) prompting can improve performance
- Learn how retrieval and grounding improve model accuracy
- Gain hands-on experience building small NLP and LLM workflows
We invite you to join us!
Cost and Registration
- Students: $25
- Non-students: $50
Program Schedule
Morning Sessions
9:00–10:30 | Session 1—Foundations of NLP and Embeddings
This session introduces traditional NLP techniques and motivates the shift toward dense vector representations. Topics include:
- Bag-of-Words (BoW) and TF-IDF
- Limitations of sparse representations: no word order, no notion of meaning (illustrated in the code sketch below)
- Transition to continuous embeddings
- Word2Vec, GloVe, fastText
- Semantic geometry: similarity and analogy reasoning
Outcome: Participants will understand how text becomes vectors and why embeddings transformed NLP.
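As a concrete preview of the "no notion of meaning" limitation, the minimal sketch below (assuming scikit-learn is installed; the sentences are toy examples, not workshop data) shows how TF-IDF cosine similarity rewards word overlap rather than shared meaning:

```python
# Minimal sketch: sparse TF-IDF vectors miss synonymy.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "the movie was wonderful",
    "the film was fantastic",   # same meaning, different words
    "the movie was terrible",   # opposite meaning, shared words
]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(docs)  # sparse document-term matrix

# Pairwise cosine similarities between the three documents.
print(cosine_similarity(X).round(2))
# Docs 0 and 2 score highest despite opposite sentiment, while the
# paraphrase pair 0 and 1 scores low: exactly the gap that dense
# embeddings such as Word2Vec and GloVe were designed to close.
```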
10:30–10:45 | Break
10:45–12:00 | Session 2—Transformer Architecture and Pretraining
A focused introduction to the architecture underlying all modern LLMs. Topics include:
- Self-attention mechanism (sketched in code after this session's outcome)
- Multi-head attention
- Positional encoding
- Encoder vs. decoder structure
- Pretraining objectives: next-token prediction, masked language modelling
- Why scaling Transformers leads to emergent capabilities
Outcome: Participants will gain intuition for how Transformers operate and why they scale effectively.
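To make the self-attention bullet concrete, here is a minimal NumPy sketch of single-head scaled dot-product attention; the dimensions and random weights are illustrative assumptions, not those of any real model:

```python
# Minimal sketch of scaled dot-product self-attention.
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over a sequence X of shape (T, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # (T, T) attention logits
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted mix of values

rng = np.random.default_rng(0)
T, d = 4, 8                                  # toy sizes: 4 tokens, 8 dims
X = rng.normal(size=(T, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)   # (4, 8): one vector per token
```

Multi-head attention repeats this operation with several independent weight sets and concatenates the results, which is the next bullet in the session.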
12:00–13:00 | Lunch
Afternoon Sessions
13:00–14:30 | Session 3—Large Language Models: Reasoning, Alignment, and Applications
This is the main conceptual session of the afternoon. Topics include:
- What makes a model “large”?
- Instruction tuning
- Supervised fine-tuning (SFT)
- Reinforcement Learning from Human Feedback (RLHF)
- CoT prompting and why it improves reasoning performance
- Hallucinations, grounding, and a brief introduction to RAG
- Example: vanilla prompt vs. CoT prompt (live reasoning demo; see the prompt templates below)
Outcome: Participants will understand how modern LLMs reason, how alignment works, and how prompting strategies affect output quality.
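The vanilla-vs-CoT comparison can be previewed with two prompt templates. The sketch below shows the prompt strings only; the arithmetic question is a standard illustrative example, and the model call itself is left to whichever chat API the live demo uses:

```python
# Illustrative prompt templates for the vanilla-vs-CoT demo.
question = (
    "A cafeteria had 23 apples. It used 20 for lunch and bought 6 more. "
    "How many apples does it have now?"
)

# Vanilla prompt: ask for the answer directly.
vanilla_prompt = f"{question}\nAnswer:"

# CoT prompt: elicit intermediate reasoning steps before the answer.
cot_prompt = f"{question}\nLet's think step by step, then give the final answer."

print(vanilla_prompt)
print(cot_prompt)
```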
14:30–14:45 | Break
14:45–16:00 | Session 4—Live Coding Demonstration: Embeddings, Reasoning, and RAG
This hands-on session connects all concepts from the day with practical examples.
Live examples will include:
- Generating text embeddings
- Performing semantic similarity search
- A minimal RAG pipeline (sketched after this session's outcome)
- Demonstrating reasoning with and without Chain-of-Thought
- A small end-to-end example: upload text → embed → retrieve → prompt → answer
Outcome: Participants will see how NLP and LLM systems are constructed in practice and leave with reproducible Python code.
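For reference before the session, a minimal RAG pipeline along the lines of the list above might look like the following sketch. It assumes the sentence-transformers package with the all-MiniLM-L6-v2 model (our choice for illustration, not necessarily the demo's), and leaves the final generation call as a stub:

```python
# Minimal RAG sketch: embed -> retrieve -> ground the prompt.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# 1. Embed a tiny document store (toy sentences for illustration).
docs = [
    "The workshop takes place at the University of Manitoba.",
    "Registration costs $25 for students and $50 for non-students.",
    "Session 4 is a live coding demonstration.",
]
doc_vecs = model.encode(docs, normalize_embeddings=True)

# 2. Embed the query and retrieve the most similar document.
#    With normalized vectors, the dot product is cosine similarity.
query = "How much does registration cost?"
q_vec = model.encode([query], normalize_embeddings=True)[0]
best = int(np.argmax(doc_vecs @ q_vec))

# 3. Ground the prompt in the retrieved context.
prompt = f"Context: {docs[best]}\n\nQuestion: {query}\nAnswer:"
print(prompt)  # pass this to whichever LLM the demo uses
```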
About the Speaker
Lei Ding is an Assistant Professor in the Department of Statistics at the University of Manitoba. He previously held a postdoctoral position at the University of Alberta, where he also completed his PhD in Statistical Machine Learning in 2024. His research lies at the intersection of Large Language Models, Natural Language Processing, and Statistical Learning. Dr. Ding has authored over 20 publications in leading international conferences and journals, including the Conference on Neural Information Processing Systems (NeurIPS), the International Conference on Machine Learning (ICML), the AAAI Conference on Artificial Intelligence, the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), and PNAS Nexus.
About the Series
The CANSSI Prairies Workshop Series in Data Science offers an excellent opportunity to enhance knowledge and skills in various areas of data science. Through engaging, interactive hybrid (online and in-person) sessions, participants explore new topics, learn cutting-edge techniques, and connect with experts in the field.