BUS350 Data Analytics in R
Credits (ECTS): 5
Course responsible: Dag Einar Sommervoll
Campus / Online: Taught on campus in Ås
Teaching language: English
Course frequency: Annually.
Nominal workload: 125 hours. This is a work-intensive course.
Teaching and exam period: This course starts in the Autumn parallel. This course has teaching/evaluation in the Autumn parallel.
About this course
In this course you will learn to use R to solve common problems in data analysis. You will gain fundamental knowledge about data structures, analysis, and visualization.
Key benefits of working in R include fully reproducible work and analysis, the ability to handle large datasets and continuous streams of data, access to state-of-the-art modeling and data visualization techniques, and much more.
The course is divided into four parts:
- Data exploration (4 weeks)
- Data wrangling (4 weeks)
- Programming in R (3 weeks)
- Models in R (2 weeks)
It will cover the following topics:
- Common data structures and data sources
- Common file formats and data import
- R and RStudio
- Quarto
- Data transformation
- Data and model visualization using `ggplot`
- Exploratory data analysis
- Programming concepts (functions, vectors, and iteration)
- Model building
A key learning outcome is effective written communication. You need to be able to communicate clearly about the choices you made before and during data gathering, cleaning, and analysis, and you need to communicate the results using text, tables, and visualizations. You will learn to create reproducible reports and presentations using Quarto. If everything is properly set up, all you have to do when you get new data is re-run your code to produce a new report with updated numbers and figures.
Participants in the course are expected to work continuously with the course and participate actively both in the seminars and on Canvas.
Learning outcome
Knowledge:
- Understand the properties of raw data structures and their implications for the use of data analysis techniques
- Be familiar with database structures and their implications for data management and data extraction
- Know important techniques for data preparation, transformation, aggregation and exploration
- Understand how pre-analysis choices, e.g., aggregating or dropping observations, affect analysis and interpretation of results
- Understand what compromises may be necessary in the data analysis process from raw data to discussion and presentation of results and how these may affect or bias the results
- Understand how programming can automate data analysis tasks, reduce errors and increase reproducibility of results
Skills:
- Have basic skills in R and RStudio
- Be able to read in data from various sources and file formats, e.g., SQL databases, Excel, XML data streams
- Be able to take "messy" raw data and prepare it for analysis
- Be able to perform basic feature engineering tasks, e.g., variable selection, transformation, and aggregation
- Be able to create informative tables and visualizations of data and analysis results
- Create reproducible reports and presentations using Quarto
General competence:
- Effectively communicate the results of data analysis using text, tables, and visualizations
- Be able to build logical arguments and justify data and analysis choices
- Be able to ask technical questions in a way that lets others step in and help with the solution