MathSuccess Dataset Analysis

Overview

This project analyzes the MathSuccess dataset, which contains data on student performance, including their scores from various entrance exams and their final course success. The goal is to compare multiple machine learning algorithms to determine which model best predicts student success.

Dataset Description

The MathSuccess dataset consists of approximately 2600 rows with the following columns:

Student: Identifier for each student.
Gender: Gender of the student.
PSATM: Score from the PSAT exam.
SATM: Score from the SAT exam.
ACTM: Score from the ACT exam.
Rank: Academic rank of the student.
Size: Size of the school.
GPAadj: Adjusted GPA.
PlcmtScore: Placement score.
Recommends: Recommendations received.
Course: Course taken by the student.
Grade: Grade received in the course.
RecTaken: Recommendations taken into account.
TooHigh/TooLow: Indicators for admission test scores.
CourseSuccess: Outcome variable indicating whether the student passed or failed.

Preprocessing Steps

Removed rows with missing values in the CourseSuccess column.
Replaced missing values in other columns with zero.
Converted categorical variables (Recommends, Grade, CourseSuccess) into factors.

Algorithms Used

The following machine learning algorithms were implemented:

CTree (Classification and Regression Trees)
RPart (Recursive Partitioning)
Support Vector Machine (SVM)
Random Forest
Neural Network

Each algorithm was chosen based on its strengths in handling classification tasks and its ability to provide insights into student performance.

Results

The performance metrics for each algorithm are summarized below:

Algorithm	Accuracy	Precision	Recall	F1 Score	Error Rate
C_Tree	0.9953	1.0000	0.9931	0.9965	0.0047
R_Part	1.0000	1.0000	1.0000	1.0000	0.0000
SVM	0.9812	1.0000	0.9730	0.9863	0.0188
Random Forest	0.9930	1.0000	0.9897	0.9948	0.0070
Neural Network	0.9953	1.0000	0.9931	0.9965	0.0047

Installation

To run this project, ensure you have R and RStudio installed on your machine, along with the following libraries:

install.packages(c("rpart", "party", "e1071", "randomForest", "nnet", "caTools"))

Usage

Clone this repository and run the code in RStudio to replicate the analysis:

git clone https://github.com/yourusername/MathSuccessAnalysis.git
cd MathSuccessAnalysis

Open analysis.R in RStudio and execute the code to see results.

Contributing

Contributions are welcome! Please feel free to submit a pull request or open an issue for any suggestions or improvements.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Feel free to adjust any sections according to your preferences or add any additional information that may be relevant to your project!

Citations: [1] https://ppl-ai-file-upload.s3.amazonaws.com/web/direct-files/23864058/2f3d7bfb-4067-492b-aa36-82edb5a9f5f2/Screenshot-2024-11-05-at-8.16.54-PM.pdf

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
Code		Code
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

MathSuccess Dataset Analysis

Overview

Table of Contents

Dataset Description

Preprocessing Steps

Algorithms Used

Results

Installation

Usage

Contributing

License

About

Uh oh!

Releases

Packages

AayushPrranav/Comparative-Analysis-of-Machine-Learning-Algorithms-on-the-MathSuccess-Dataset

Folders and files

Latest commit

History

Repository files navigation

MathSuccess Dataset Analysis

Overview

Table of Contents

Dataset Description

Preprocessing Steps

Algorithms Used

Results

Installation

Usage

Contributing

License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Packages