ABW505 Mock Exams (1+2): Python & Machine Learning (Verified Answers)
Exam Information
| Item | Details |
|---|---|
| Total Points | 100 |
| Time Allowed | 90 minutes |
| Format | Closed book, calculator allowed |
| Structure | Q1 (20pts) + Q2 (30pts, choose 3/5) + Q3 (25pts) + Q4 (25pts) |
Question 1: Python Output Analysis (20 points)
Answer ALL questions. Determine exact output.
Q1.1 (5 points)
x = 15
y = 4
print((x // y) ** 2 + x % y)
Step-by-step breakdown:
# Given values
x = 15
y = 4
# Step 1: Floor division x // y
# 15 // 4 = 3 (integer part of 15/4 = 3.75)
floor_result = 15 // 4 # = 3
# Step 2: Modulo x % y
# 15 % 4 = 3 (remainder when 15 divided by 4)
# 15 = 4 × 3 + 3, so remainder is 3
mod_result = 15 % 4 # = 3
# Step 3: Power (floor_result) ** 2
# 3 ** 2 = 9
power_result = 3 ** 2 # = 9
# Step 4: Addition
# 9 + 3 = 12
final = 9 + 3 # = 12
Answer: 12
Key operators explained:
| Operator | Name | Example |
|---|---|---|
| // | Floor division | 15 // 4 = 3 |
| % | Modulo | 15 % 4 = 3 |
| ** | Exponentiation | 3 ** 2 = 9 |
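As a quick check of the three operators, the sketch below re-evaluates the Q1.1 expression with divmod(), which returns the floor quotient and the remainder together. The negative-operand case is an added illustration and is not part of the exam question.

```python
# Evaluate the Q1.1 expression step by step.
x, y = 15, 4
quotient, remainder = divmod(x, y)   # same pair as (x // y, x % y)
answer = quotient ** 2 + remainder   # ** binds tighter than +

print(quotient, remainder, answer)

# Floor division rounds toward negative infinity, so the remainder
# takes the sign of the divisor (extra case, not in the exam question).
neg_quotient, neg_remainder = divmod(-15, 4)
print(neg_quotient, neg_remainder)

# The two operators always satisfy x == (x // y) * y + x % y.
print(-15 == neg_quotient * 4 + neg_remainder)
```

The identity in the last line is a handy way to sanity-check a hand calculation under exam conditions.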
Q1.2 (5 points)
numbers = [10, 20, 30, 40, 50]
numbers[1:4] = [100]
print(len(numbers))
print(numbers[2])
Step-by-step breakdown:
# Original list
numbers = [10, 20, 30, 40, 50]
# Indices: 0 1 2 3 4
# Slice assignment: numbers[1:4] = [100]
# This replaces elements at indices 1, 2, 3 with a single element 100
# Before: [10, 20, 30, 40, 50]
# ^^^^^^^^^^^^ <- indices 1:4 (elements 20, 30, 40)
# After: [10, 100, 50]
# ^^^ <- replaced with single element
# Result after slice assignment
# numbers = [10, 100, 50]
# Indices: 0 1 2
# len(numbers) = 3 (was 5, replaced 3 elements with 1)
# numbers[2] = 50 (third element)
Answers:
- len(numbers) → 3
- numbers[2] → 50
Important concept: Slice assignment can change list size! Here we replaced 3 elements (indices 1, 2, 3) with 1 element, reducing length from 5 to 3.
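To see the size change in both directions, here is a small sketch that applies three different slice assignments to copies of the same list. The variable names are illustrative only.

```python
# Slice assignment replaces a range of elements with any iterable,
# so the list can shrink, grow, or lose the range entirely.
numbers = [10, 20, 30, 40, 50]

shrunk = numbers.copy()
shrunk[1:4] = [100]            # 3 elements out, 1 in

grown = numbers.copy()
grown[1:2] = [21, 22, 23]      # 1 element out, 3 in

deleted = numbers.copy()
deleted[1:4] = []              # 3 elements out, none in

print(shrunk, len(shrunk))
print(grown, len(grown))
print(deleted, len(deleted))
```

Note the contrast with index assignment: numbers[1] = [100] would store a nested list at position 1 and leave the length unchanged.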
Q1.3 (5 points)
def mystery(a, b=5, c=10):
    return a * 2 + b - c
result = mystery(3, c=4)
print(result)
Step-by-step breakdown:
# Function definition
def mystery(a, b=5, c=10):
    # a: required parameter
    # b: optional, default = 5
    # c: optional, default = 10
    return a * 2 + b - c
# Function call: mystery(3, c=4)
# a = 3 (positional argument, first position)
# b = 5 (uses default value, NOT provided in call)
# c = 4 (keyword argument, overrides default of 10)
# Calculation:
# a * 2 + b - c
# = 3 * 2 + 5 - 4
# = 6 + 5 - 4
# = 7
Answer: 7
Key concept: Keyword arguments (c=4) allow you to skip over parameters with defaults. Here b uses its default value of 5.
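The binding rules can be checked by calling the same function in several equivalent ways. This is a minimal sketch that reuses the mystery function from the question.

```python
def mystery(a, b=5, c=10):
    return a * 2 + b - c

# All four calls bind a=3, b=5, c=4 and therefore agree.
calls = [
    mystery(3, c=4),      # skip b, override c by keyword
    mystery(3, 5, 4),     # everything positional
    mystery(a=3, c=4),    # keywords only
    mystery(c=4, a=3),    # keyword order does not matter
]
print(calls)

# A positional argument after a keyword argument, as in mystery(c=4, 3),
# is a SyntaxError, so that form is left out here.
```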
Q1.4 (5 points)
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
total = 0
for key in data:
    total += sum(data[key])
print(total)
Step-by-step breakdown:
# Dictionary with lists as values
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
total = 0
# Iterating over a dictionary gives KEYS, not values
for key in data: # key will be 'A', then 'B'
    # First iteration: key = 'A'
    # data['A'] = [1, 2, 3]
    # sum([1, 2, 3]) = 6
    # total = 0 + 6 = 6
    # Second iteration: key = 'B'
    # data['B'] = [4, 5, 6]
    # sum([4, 5, 6]) = 15
    # total = 6 + 15 = 21
    total += sum(data[key])
print(total) # 21
Answer: 21
Calculation summary:
- sum([1, 2, 3]) = 6
- sum([4, 5, 6]) = 15
- Total = 6 + 15 = 21
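Since iterating a dictionary yields its keys, the same total can also be reached through .values() or .items(). The sketch below compares the three routes; the variable names are illustrative.

```python
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}

# Three equivalent ways to walk the dictionary.
by_key = sum(sum(data[key]) for key in data)              # keys, then lookup
by_value = sum(sum(values) for values in data.values())   # values directly
per_key = {key: sum(values) for key, values in data.items()}

print(by_key, by_value, per_key)
```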
Question 2: Code Writing (30 points)
Choose 3 out of 5 questions. Each worth 10 points.
Q2.1 - Grade Calculator (10 points)
Write a function grade_calculator(score) that:
- Returns letter grade: 90+ → "A", 80+ → "B", 70+ → "C", 60+ → "D", <60 → "F"
- Returns "Invalid" for scores < 0 or > 100
def grade_calculator(score):
    """
    Convert numeric score to letter grade.
    Args:
        score: Numeric score (expected range: 0-100)
    Returns:
        str: Letter grade (A/B/C/D/F) or "Invalid" for out-of-range scores
    Examples:
        >>> grade_calculator(95)
        'A'
        >>> grade_calculator(-5)
        'Invalid'
    """
    # STEP 1: Validate input range FIRST
    # Must check invalid cases before checking grade ranges
    if score < 0 or score > 100:
        return "Invalid"
    # STEP 2: Check grades from highest to lowest
    # Using elif ensures only one condition matches
    if score >= 90:
        return "A" # 90-100
    elif score >= 80:
        return "B" # 80-89
    elif score >= 70:
        return "C" # 70-79
    elif score >= 60:
        return "D" # 60-69
    else:
        return "F" # 0-59
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [
        (95, "A"),
        (85, "B"),
        (73, "C"),
        (65, "D"),
        (45, "F"),
        (-5, "Invalid"),
        (105, "Invalid"),
        (100, "A"), # Edge case: exactly 100
        (0, "F"), # Edge case: exactly 0
    ]
    print("Testing grade_calculator:")
    for score, expected in test_cases:
        result = grade_calculator(score)
        status = "✓" if result == expected else "✗"
        print(f" {status} grade_calculator({score}) = {result} (expected {expected})")
Test Output:
Testing grade_calculator:
 ✓ grade_calculator(95) = A (expected A)
 ✓ grade_calculator(85) = B (expected B)
 ✓ grade_calculator(73) = C (expected C)
 ✓ grade_calculator(65) = D (expected D)
 ✓ grade_calculator(45) = F (expected F)
 ✓ grade_calculator(-5) = Invalid (expected Invalid)
 ✓ grade_calculator(105) = Invalid (expected Invalid)
 ✓ grade_calculator(100) = A (expected A)
 ✓ grade_calculator(0) = F (expected F)
Common mistakes to avoid:
- Not validating input range first
- Using multiple if statements instead of elif
- Checking in wrong order (e.g., 60+ before 90+)
Q2.2 - Remove Duplicates (10 points)
Write a function remove_duplicates(lst) that:
- Removes duplicates from a list
- Preserves the order of first occurrence
- Example: [1, 2, 2, 3, 1, 4] → [1, 2, 3, 4]
def remove_duplicates(lst):
    """
    Remove duplicate elements while preserving order of first occurrence.
    Args:
        lst: Input list with possible duplicates
    Returns:
        list: New list with duplicates removed, order preserved
    Examples:
        >>> remove_duplicates([1, 2, 2, 3, 1, 4])
        [1, 2, 3, 4]
    """
    # Track elements we've already seen
    seen = []
    # Iterate through original list
    for item in lst:
        # Only add to result if not seen before
        if item not in seen:
            seen.append(item)
    return seen
# Alternative approach using dictionary (Python 3.7+ preserves order)
def remove_duplicates_v2(lst):
    """
    Remove duplicates using dict.fromkeys() - more efficient for large lists.
    Works because dictionaries preserve insertion order in Python 3.7+.
    """
    return list(dict.fromkeys(lst))
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [
        [1, 2, 2, 3, 1, 4], # Basic case
        [5, 5, 5, 5], # All duplicates
        [1, 2, 3, 4], # No duplicates
        [], # Empty list
        ['a', 'b', 'a', 'c'], # Strings
    ]
    print("Testing remove_duplicates:")
    for test in test_cases:
        result = remove_duplicates(test)
        print(f" {test} → {result}")
Test Output:
Testing remove_duplicates:
 [1, 2, 2, 3, 1, 4] → [1, 2, 3, 4]
 [5, 5, 5, 5] → [5]
 [1, 2, 3, 4] → [1, 2, 3, 4]
 [] → []
 ['a', 'b', 'a', 'c'] → ['a', 'b', 'c']
Why not use set()? Sets don't preserve order! list(set([1, 2, 2, 3, 1, 4])) might give [1, 2, 3, 4] but order is NOT guaranteed.
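A set can still help with speed if it is used only for membership tests while a separate list records the order. The sketch below shows that pattern; remove_duplicates_fast is a hypothetical helper name, and it assumes every item is hashable.

```python
def remove_duplicates_fast(lst):
    """Order-preserving de-duplication with fast membership tests.

    Illustrative helper; assumes every item is hashable.
    """
    seen = set()     # answers "have we met this item?" quickly
    result = []      # records first occurrences in order
    for item in lst:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

sample = [3, 1, 3, 2, 1, 4]
fast = remove_duplicates_fast(sample)
via_dict = list(dict.fromkeys(sample))
print(fast, via_dict)
```

Both results keep 3 in front because it appeared first, which is exactly what a bare set() does not promise.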
Q2.3 - Fibonacci Sequence (10 points)
Write a function fibonacci(n) that:
- Returns the first n Fibonacci numbers as a list
- Sequence: 0, 1, 1, 2, 3, 5, 8, 13...
def fibonacci(n):
    """
    Generate the first n Fibonacci numbers.
    Fibonacci sequence: Each number is the sum of the two preceding ones.
    Starts with 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...
    Args:
        n: Number of Fibonacci numbers to generate (non-negative integer)
    Returns:
        list: First n Fibonacci numbers
    Examples:
        >>> fibonacci(5)
        [0, 1, 1, 2, 3]
        >>> fibonacci(0)
        []
    """
    # Handle edge cases
    if n <= 0:
        return [] # No numbers requested
    if n == 1:
        return [0] # Only first number
    # Initialize with first two Fibonacci numbers
    result = [0, 1]
    # Generate remaining numbers
    for i in range(2, n):
        # Each new number = sum of last two numbers
        # Using negative indexing: result[-1] is last, result[-2] is second-to-last
        next_num = result[-1] + result[-2]
        result.append(next_num)
    return result
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [0, 1, 2, 5, 8, 10]
    print("Testing fibonacci:")
    for n in test_cases:
        result = fibonacci(n)
        print(f" fibonacci({n}) = {result}")
Test Output:
Testing fibonacci:
 fibonacci(0) = []
 fibonacci(1) = [0]
 fibonacci(2) = [0, 1]
 fibonacci(5) = [0, 1, 1, 2, 3]
 fibonacci(8) = [0, 1, 1, 2, 3, 5, 8, 13]
 fibonacci(10) = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
How it works:
Position: 0  1  2  3  4  5  6  7
Value:    0  1  1  2  3  5  8  13
From position 2 onward, each value is the sum of the two before it:
0+1=1, 1+1=2, 1+2=3, 2+3=5, 3+5=8, 5+8=13
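The "sum of the previous two" rule only needs the last two values, so it can also be written with tuple unpacking instead of a growing list. The generator below is a sketch of that idea; fib_gen is a hypothetical name, not part of the exam answer.

```python
from itertools import islice

def fib_gen():
    """Yield Fibonacci numbers indefinitely (illustrative helper)."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b   # update both values in one step

# islice takes the first 8 values from the endless generator.
first_eight = list(islice(fib_gen(), 8))
print(first_eight)
```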
Q2.4 - Prime Number Check (10 points)
Write a function is_prime(num) that:
- Returns True if num is prime, False otherwise
- Handle edge cases (num < 2)
def is_prime(num):
    """
    Check if a number is prime.
    A prime number is a natural number greater than 1 that has no positive
    divisors other than 1 and itself.
    Args:
        num: Integer to check
    Returns:
        bool: True if prime, False otherwise
    Examples:
        >>> is_prime(7)
        True
        >>> is_prime(12)
        False
    """
    # Numbers less than 2 are not prime by definition
    # This handles 0, 1, and negative numbers
    if num < 2:
        return False
    # 2 is the only even prime number
    if num == 2:
        return True
    # All other even numbers are not prime
    # (They're divisible by 2)
    if num % 2 == 0:
        return False
    # Check odd divisors from 3 up to √num
    # Why √num? If num = a × b, at least one of a,b must be ≤ √num
    # If no divisor found up to √num, num is prime
    for i in range(3, int(num ** 0.5) + 1, 2): # Step by 2 (odd numbers only)
        if num % i == 0:
            return False # Found a divisor, not prime
    return True # No divisors found, it's prime
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [
        (1, False), # Not prime (less than 2)
        (2, True), # Prime (smallest prime)
        (3, True), # Prime
        (4, False), # Not prime (2 × 2)
        (7, True), # Prime
        (9, False), # Not prime (3 × 3)
        (11, True), # Prime
        (25, False), # Not prime (5 × 5)
        (29, True), # Prime
        (-5, False), # Negative, not prime
    ]
    print("Testing is_prime:")
    for num, expected in test_cases:
        result = is_prime(num)
        status = "✓" if result == expected else "✗"
        print(f" {status} is_prime({num}) = {result}")
Test Output:
Testing is_prime:
 ✓ is_prime(1) = False
 ✓ is_prime(2) = True
 ✓ is_prime(3) = True
 ✓ is_prime(4) = False
 ✓ is_prime(7) = True
 ✓ is_prime(9) = False
 ✓ is_prime(11) = True
 ✓ is_prime(25) = False
 ✓ is_prime(29) = True
 ✓ is_prime(-5) = False
Optimization: Checking up to √n instead of n reduces time complexity from O(n) to O(√n).
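One possible refinement of the square-root bound is math.isqrt (Python 3.8+), which returns the exact integer square root and so avoids any floating-point rounding for very large inputs. The variant below is a sketch under that assumption; is_prime_isqrt is a hypothetical name.

```python
import math

def is_prime_isqrt(num):
    """Trial division up to the integer square root (illustrative variant)."""
    if num < 2:
        return False
    if num % 2 == 0:
        return num == 2          # 2 is the only even prime
    # math.isqrt returns floor(sqrt(num)) as an exact int
    for divisor in range(3, math.isqrt(num) + 1, 2):
        if num % divisor == 0:
            return False
    return True

primes_below_30 = [n for n in range(30) if is_prime_isqrt(n)]
print(primes_below_30)
```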
Q2.5 - Tuple Statistics (10 points)
Write a function tuple_stats(data) that:
- Input: tuple of numbers
- Return: tuple of (min, max, average rounded to 2 decimals)
def tuple_stats(data):
    """
    Calculate statistics for a tuple of numbers.
    Args:
        data: Tuple of numeric values
    Returns:
        tuple: (minimum, maximum, average) where average is rounded to 2 decimals
    Raises:
        ValueError: If tuple is empty
    Examples:
        >>> tuple_stats((10, 20, 30, 40))
        (10, 40, 25.0)
    """
    # Handle empty tuple edge case
    if len(data) == 0:
        raise ValueError("Cannot compute stats for empty tuple")
    # Calculate statistics using built-in functions
    minimum = min(data) # Smallest value
    maximum = max(data) # Largest value
    average = sum(data) / len(data) # Arithmetic mean
    # Round average to 2 decimal places
    average = round(average, 2)
    # Return as tuple (note: using parentheses to make it clear)
    return (minimum, maximum, average)
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [
        (10, 20, 30, 40), # Even spread
        (5, 15, 25), # Odd count
        (7,), # Single element
        (1, 2, 3, 4, 5, 6, 7, 8, 9, 10), # Larger tuple
    ]
    print("Testing tuple_stats:")
    for data in test_cases:
        result = tuple_stats(data)
        print(f" tuple_stats({data})")
        print(f" → (min={result[0]}, max={result[1]}, avg={result[2]})")
Test Output:
Testing tuple_stats:
 tuple_stats((10, 20, 30, 40))
 → (min=10, max=40, avg=25.0)
 tuple_stats((5, 15, 25))
 → (min=5, max=25, avg=15.0)
 tuple_stats((7,))
 → (min=7, max=7, avg=7.0)
 tuple_stats((1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
 → (min=1, max=10, avg=5.5)
Question 3: Pandas & ML Basics (25 points)
Part A: Theory (10 points)
Q3.A1 (5 points) Explain the difference between fit() and predict() in scikit-learn.
| Method | Purpose | When Called | What It Does |
|---|---|---|---|
| fit() | Train the model | Once, on training data | Learns patterns/parameters from data |
| predict() | Use the model | On test/new data | Applies learned patterns to make predictions |
Workflow example:
# X_train, y_train and X_test are assumed to exist already
from sklearn.tree import DecisionTreeClassifier
# Step 1: Create model
model = DecisionTreeClassifier()
# Step 2: Train model (learn from training data)
model.fit(X_train, y_train) # Learns patterns
# Step 3: Use model (apply to new data)
predictions = model.predict(X_test) # Makes predictions
Analogy:
- fit() = studying for an exam
- predict() = taking the exam
Q3.A2 (5 points) Why do we need train-test split? Why not use all data for training?
Why train-test split is essential:
- Evaluate on unseen data: We need to test how the model performs on data it hasn't seen during training.
- Detect overfitting: If we train and test on the same data, the model might just memorize the answers (overfitting). Train-test split reveals if the model generalizes well.
- Simulate real-world usage: In production, the model will encounter new, unseen data. Testing on held-out data simulates this.
- Get honest performance estimate: Training accuracy is often misleadingly high; test accuracy gives a realistic measure.
What happens without split:
- Model could achieve 100% accuracy on training data
- But fail completely on new data
- No way to detect this problem until deployment
Typical split ratios:
- 80/20 (training/test)
- 70/30 (training/test)
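The mechanics of an 80/20 split can be sketched in plain Python: shuffle once, then cut. holdout_split is a hypothetical helper written for illustration only; in a scikit-learn workflow the usual tool is train_test_split.

```python
import random

def holdout_split(rows, test_ratio=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test).

    Illustrative helper, not a library function.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is repeatable
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(10))
train, test = holdout_split(rows, test_ratio=0.2)

print(len(train), len(test))
print(set(train).isdisjoint(test))   # no row lands in both parts
```

The disjointness check is the point of the exercise: the model is never scored on a row it was trained on.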
Part B: Pandas Code (15 points)
Given this DataFrame:
import pandas as pd
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, None, 28],
    'Salary': [50000, 60000, 75000, 65000, None],
    'Department': ['IT', 'HR', 'IT', 'Finance', 'HR']
}
df = pd.DataFrame(data)
Q3.B1 (5 points) Fill missing Age with mean, missing Salary with 55000.
import pandas as pd
# Create the DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, None, 28],
    'Salary': [50000, 60000, 75000, 65000, None],
    'Department': ['IT', 'HR', 'IT', 'Finance', 'HR']
}
df = pd.DataFrame(data)
print("Before filling:")
print(df)
print()
# Method 1: Assign the result of fillna back to the column.
# (Calling fillna(..., inplace=True) on df['Age'] is chained assignment;
# recent pandas versions warn about it and it may leave df unchanged.)
# Fill missing Age with mean
age_mean = df['Age'].mean() # Calculate mean first (ignores NaN)
print(f"Age mean (excluding NaN): {age_mean}") # = 29.5
df['Age'] = df['Age'].fillna(age_mean)
# Fill missing Salary with 55000
df['Salary'] = df['Salary'].fillna(55000)
print("\nAfter filling:")
print(df)
Output:
Before filling:
Name Age Salary Department
0 Alice 25.0 50000.0 IT
1 Bob 30.0 60000.0 HR
2 Charlie 35.0 75000.0 IT
3 David NaN 65000.0 Finance
4 Eve 28.0 NaN HR
Age mean (excluding NaN): 29.5
After filling:
Name Age Salary Department
0 Alice 25.0 50000.0 IT
1 Bob 30.0 60000.0 HR
2 Charlie 35.0 75000.0 IT
3 David 29.5 65000.0 Finance
4 Eve 28.0 55000.0 HR
Note: mean() automatically ignores NaN values when calculating.
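To make the "ignores NaN" arithmetic concrete without pandas, the sketch below reproduces it in plain Python. mean_skip_missing is a hypothetical helper, and None stands in for the missing entry.

```python
def mean_skip_missing(values):
    """Average only the entries that are present (illustrative helper)."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present)

ages = [25, 30, 35, None, 28]
age_mean = mean_skip_missing(ages)                 # (25 + 30 + 35 + 28) / 4
filled = [age_mean if v is None else v for v in ages]

print(age_mean)
print(filled)
```

The divisor is 4, not 5: the missing row is left out of both the sum and the count, which is why David's filled Age is 29.5.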
Q3.B2 (5 points) Calculate average salary by Department.
# Group by Department and calculate mean of Salary
avg_salary_by_dept = df.groupby('Department')['Salary'].mean()
print("Average Salary by Department:")
print(avg_salary_by_dept)
Output (after filling missing values):
Average Salary by Department:
Department
Finance 65000.0
HR 57500.0
IT 62500.0
Name: Salary, dtype: float64
Calculation breakdown:
- Finance: 65000 (only David)
- HR: (60000 + 55000) / 2 = 57500 (Bob + Eve)
- IT: (50000 + 75000) / 2 = 62500 (Alice + Charlie)
Q3.B3 (5 points) Select IT employees with Age > 26.
# Filter with multiple conditions
# IMPORTANT: Use & for AND, | for OR
# IMPORTANT: Wrap each condition in parentheses
result = df[(df['Department'] == 'IT') & (df['Age'] > 26)]
print("IT employees with Age > 26:")
print(result)
Output:
IT employees with Age > 26:
Name Age Salary Department
2 Charlie 35.0 75000.0 IT
Syntax rules for pandas filtering:
- Use & instead of and
- Use | instead of or
- Use ~ instead of not
- Wrap each condition in parentheses
Wrong: df[df['A'] == 1 and df['B'] == 2]
Correct: df[(df['A'] == 1) & (df['B'] == 2)]
Question 4: Decision Tree & Naive Bayes (25 points)
Part A: Theory (10 points)
Q4.A1 (5 points) List THREE advantages of Decision Trees.
Any three of the following earn full marks:
- Easy to interpret and visualize
  - Can draw the tree and follow decision paths
  - Non-technical stakeholders can understand the logic
  - "If-then" rules are intuitive
- No feature scaling required
  - Works directly with raw data values
  - Unlike SVM or KNN, doesn't need normalization
  - Saves preprocessing time
- Handles both numerical and categorical data
  - Can split on continuous values (Age > 30)
  - Can split on categories (Color == 'Red')
  - Versatile for mixed datasets
- Captures non-linear relationships
  - Can model complex decision boundaries
  - Doesn't assume linear separability
- Shows feature importance
  - Reveals which features matter most
  - Helps with feature selection
Q4.A2 (5 points) Gini Index vs Information Gain. Which does CART use?
| Metric | Formula | Range (binary) | Used By |
|---|---|---|---|
| Gini Index | 1 - Σ(pᵢ²) | 0 to 0.5 | CART |
| Entropy/Information Gain | -Σ(pᵢ log₂ pᵢ) | 0 to 1 | ID3, C4.5 |
CART (Classification and Regression Trees) uses Gini Index.
Why Gini?
- Faster to compute (no logarithm)
- Similar results to entropy in practice
- Slightly favors larger partitions
Interpretation:
- Gini = 0 → Pure node (all same class)
- Gini = 0.5 → Maximum impurity (50/50 split)
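The two boundary cases can be verified with a few lines of Python. The gini function below is a minimal sketch of the formula 1 - Σ(pᵢ²), taking raw class counts for one node; the function name and the sample counts are illustrative.

```python
def gini(counts):
    """Gini impurity 1 - sum(p_i^2) for a node with the given class counts."""
    total = sum(counts)
    return 1 - sum((count / total) ** 2 for count in counts)

pure = gini([10, 0])     # every sample in one class
even = gini([5, 5])      # 50/50 binary node
skewed = gini([9, 1])    # mostly one class, close to pure

print(pure, even, round(skewed, 2))
```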
Part B: Gini Calculation (15 points)
Scenario: Email classification with 20 emails (12 Spam, 8 Not Spam)
Split 1 - "Contains free":
- Contains "free": 10 emails (9 Spam, 1 Not Spam)
- No "free": 10 emails (3 Spam, 7 Not Spam)
Split 2 - "Contains meeting":
- Contains "meeting": 8 emails (2 Spam, 6 Not Spam)
- No "meeting": 12 emails (10 Spam, 2 Not Spam)
Q4.B1 (8 points) Calculate Gini Index for Split 1.
Formula: Gini = 1 - Σ(pᵢ²)
Step 1: Gini for "Contains free" node (10 emails: 9 Spam, 1 Not Spam)
P(Spam) = 9/10 = 0.9
P(Not Spam) = 1/10 = 0.1
Gini = 1 - (0.9² + 0.1²)
= 1 - (0.81 + 0.01)
= 1 - 0.82
= 0.18
Step 2: Gini for "No free" node (10 emails: 3 Spam, 7 Not Spam)
P(Spam) = 3/10 = 0.3
P(Not Spam) = 7/10 = 0.7
Gini = 1 - (0.3² + 0.7²)
= 1 - (0.09 + 0.49)
= 1 - 0.58
= 0.42
Step 3: Weighted Average Gini
Gini(Split 1) = (10/20) × 0.18 + (10/20) × 0.42
= 0.5 × 0.18 + 0.5 × 0.42
= 0.09 + 0.21
= 0.30
Answer: Split 1 Gini = 0.30
Q4.B2 (7 points) Calculate Gini Index for Split 2. Which split is better?
Step 1: Gini for "Contains meeting" node (8 emails: 2 Spam, 6 Not Spam)
P(Spam) = 2/8 = 0.25
P(Not Spam) = 6/8 = 0.75
Gini = 1 - (0.25² + 0.75²)
= 1 - (0.0625 + 0.5625)
= 1 - 0.625
= 0.375
Step 2: Gini for "No meeting" node (12 emails: 10 Spam, 2 Not Spam)
P(Spam) = 10/12 = 0.833
P(Not Spam) = 2/12 = 0.167
Gini = 1 - (0.833² + 0.167²)
= 1 - (0.694 + 0.028)
= 1 - 0.722
= 0.278
Step 3: Weighted Average Gini
Gini(Split 2) = (8/20) × 0.375 + (12/20) × 0.278
= 0.4 × 0.375 + 0.6 × 0.278
= 0.15 + 0.167
= 0.317
Answer: Split 2 Gini = 0.317
Comparison:
| Split | Gini Index |
|---|---|
| Split 1 ("free") | 0.30 (better) |
| Split 2 ("meeting") | 0.317 |
Better split: Split 1 ("Contains free")
Reason: Lower Gini = Lower impurity = Better separation of classes
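The hand calculation for both splits can be cross-checked in code. This sketch assumes each child node is written as a (spam, not_spam) count pair taken from the scenario; gini and weighted_gini are illustrative helper names.

```python
def gini(counts):
    """Gini impurity for one node, given its class counts."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

def weighted_gini(children):
    """Size-weighted average Gini over the child nodes of a split."""
    n = sum(sum(child) for child in children)
    return sum(sum(child) / n * gini(child) for child in children)

# Child nodes as (spam, not_spam) counts from the email scenario.
split_free = [(9, 1), (3, 7)]
split_meeting = [(2, 6), (10, 2)]

g_free = weighted_gini(split_free)
g_meeting = weighted_gini(split_meeting)

print(round(g_free, 3), round(g_meeting, 3))
print("better split:", "free" if g_free < g_meeting else "meeting")
```

Working from counts instead of rounded probabilities avoids the small rounding drift that creeps in when 10/12 is written as 0.833 by hand.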
End of Exam
| Question | Topic | Points |
|---|---|---|
| Q1 | Python Output Analysis | 20 |
| Q2 | Code Writing (choose 3/5) | 30 |
| Q3 | Pandas & ML Theory | 25 |
| Q4 | Decision Tree & Gini | 25 |
| Total | | 100 |
Key Formulas Reference
| Concept | Formula |
|---|---|
| Gini Index | 1 - Σ(pᵢ²) |
| Entropy | -Σ pᵢ log₂(pᵢ) |
| Info Gain | H(parent) - Σ weighted H(children) |
| Bayes | P(A\|B) ∝ P(B\|A) × P(A) |
| Z-score | (x - μ) / σ |
| MinMax | (x - min) / (max - min) |
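As a worked instance of the last two rows, the sketch below applies the Z-score and MinMax formulas to a small made-up list. It assumes σ is the population standard deviation; the sample values are hypothetical.

```python
import statistics

values = [10, 20, 30, 40, 50]   # made-up sample data

# Z-score: (x - mu) / sigma, using the population standard deviation
mu = statistics.fmean(values)
sigma = statistics.pstdev(values)
z_scores = [(x - mu) / sigma for x in values]

# MinMax: (x - min) / (max - min), which maps the data onto [0, 1]
lo, hi = min(values), max(values)
min_max = [(x - lo) / (hi - lo) for x in values]

print([round(z, 3) for z in z_scores])
print(min_max)
```

After Z-scoring the values are centred on 0, while MinMax pins the smallest value to 0 and the largest to 1.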
All code verified and tested. Show your work for partial credit. Good luck!
ABW505 Mock Exam 2 - Python & Machine Learning
Exam Information
| Item | Details |
|---|---|
| Total Points | 100 |
| Time Allowed | 90 minutes |
| Format | Closed book, calculator allowed |
| Structure | Q1 (20pts) + Q2 (30pts, choose 3/5) + Q3 (25pts) + Q4 (25pts) |
Question 1: Python Output Analysis (20 points)
Answer ALL. Determine exact output.
Q1.1 (5 points)
a = [1, 2, 3]
b = a
b.append(4)
print(a)
print(a is b)
Step-by-step breakdown:
# Step 1: Create a list and assign to variable 'a'
a = [1, 2, 3]
# Memory: a points to list object [1, 2, 3]
# Step 2: Assign 'a' to 'b'
b = a
# IMPORTANT: This does NOT copy the list!
# Both 'a' and 'b' now point to the SAME list object in memory
# Memory: a → [1, 2, 3] ← b
# Step 3: Modify list through 'b'
b.append(4)
# Since a and b point to the same object,
# the change is visible through both variables
# Memory: a → [1, 2, 3, 4] ← b
# Step 4: Print results
print(a) # [1, 2, 3, 4] - modified through b
print(a is b) # True - same object in memory
Answers:
- print(a) → [1, 2, 3, 4]
- print(a is b) → True
Key concept: In Python, assignment creates a REFERENCE, not a copy.
To create an independent copy:
b = a.copy() # Method 1: copy() method
b = a[:] # Method 2: slice notation
b = list(a) # Method 3: list constructor
Q1.2 (5 points)
text = "Hello World"
print(text[0:5:2])
print(text[-5:-1])
Step-by-step breakdown:
text = "Hello World"
# Index map:
# Character:   H   e   l   l   o  ' '  W   o   r   l   d
# Positive:    0   1   2   3   4   5   6   7   8   9  10
# Negative:  -11 -10  -9  -8  -7  -6  -5  -4  -3  -2  -1
# (the space between the two words occupies index 5)
# Line 1: text[0:5:2]
# Format: [start:stop:step]
# start=0 (H), stop=5 (exclusive), step=2 (every 2nd char)
# Indices: 0, 2, 4 → Characters: 'H', 'l', 'o'
result1 = text[0:5:2] # "Hlo"
# Line 2: text[-5:-1]
# start=-5 (W), stop=-1 (exclusive, before 'd')
# Indices: -5, -4, -3, -2 → Characters: 'W', 'o', 'r', 'l'
result2 = text[-5:-1] # "Worl"
Answers:
- text[0:5:2] → Hlo
- text[-5:-1] → Worl
Slicing syntax: [start:stop:step]
- start: inclusive (default: 0)
- stop: exclusive (default: end)
- step: increment (default: 1)
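The defaults are easiest to remember by leaving each field out in turn. The sketch below does that on the same string; the variable names are illustrative.

```python
text = "Hello World"

first_word = text[:5]        # start omitted, begins at index 0
last_word = text[6:]         # stop omitted, runs to the end
every_other = text[::2]      # start and stop omitted, step 2
reversed_text = text[::-1]   # negative step walks backwards

# A slice object carries the same three fields explicitly.
same_as_q = text[slice(0, 5, 2)]

print(first_word, last_word, every_other, reversed_text, same_as_q, sep=" | ")
```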
Q1.3 (5 points)
def outer():
    x = 10
    def inner():
        nonlocal x
        x += 5
        return x
    return inner()
print(outer())
print(outer())
Step-by-step breakdown:
def outer():
    x = 10 # Local variable in outer's scope
    def inner():
        nonlocal x # Refers to x in enclosing (outer) scope
        x += 5 # Modify outer's x: 10 + 5 = 15
        return x # Return 15
    return inner() # Call inner() and return its result
# First call: outer()
# - x starts at 10 in a NEW local scope
# - inner() adds 5: x = 15
# - Returns 15
print(outer()) # 15
# Second call: outer()
# - FRESH call creates NEW local scope
# - x starts at 10 again (not preserved from first call)
# - inner() adds 5: x = 15
# - Returns 15
print(outer()) # 15
Answers:
- First print(outer()) → 15
- Second print(outer()) → 15
Key concepts:
- nonlocal modifies a variable in the enclosing scope (not global)
- Each call to outer() creates a fresh local scope
- Variable x is NOT preserved between calls
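For contrast, state does persist when the outer function returns the inner function itself and that one closure is called repeatedly. The sketch below shows this variation; make_counter is a hypothetical name and is not part of the exam question.

```python
def make_counter(start=10, step=5):
    """Return a closure that keeps its own running total (illustrative)."""
    x = start
    def inner():
        nonlocal x
        x += step
        return x
    return inner             # return the function itself, not inner()

counter = make_counter()
first = counter()            # 10 + 5
second = counter()           # same closure, so x carries over
fresh = make_counter()()     # a brand-new closure starts from 10 again

print(first, second, fresh)
```

The only difference from the exam code is return inner versus return inner(), which is a common trap in output-analysis questions.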
Q1.4 (5 points)
nums = [1, 2, 3, 4, 5]
result = [x**2 for x in nums if x % 2 == 1]
print(result)
print(sum(result))
Step-by-step breakdown:
nums = [1, 2, 3, 4, 5]
# List comprehension with filter
# Pattern: [expression for item in iterable if condition]
result = [x**2 for x in nums if x % 2 == 1]
# Step-by-step execution:
# x=1: 1 % 2 == 1? True → 1**2 = 1 → include
# x=2: 2 % 2 == 1? False → skip
# x=3: 3 % 2 == 1? True → 3**2 = 9 → include
# x=4: 4 % 2 == 1? False → skip
# x=5: 5 % 2 == 1? True → 5**2 = 25 → include
# Result: [1, 9, 25]
print(result) # [1, 9, 25]
print(sum(result)) # 1 + 9 + 25 = 35
Answers:
- print(result) → [1, 9, 25]
- print(sum(result)) → 35
Breakdown:
- Filter: odd numbers only (1, 3, 5)
- Transform: square each (1, 9, 25)
- Sum: 1 + 9 + 25 = 35
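The filter-then-transform reading of the comprehension corresponds line for line to an ordinary loop. The sketch below places the two side by side; the variable names are illustrative.

```python
nums = [1, 2, 3, 4, 5]

# The comprehension from the question...
squares_of_odds = [x**2 for x in nums if x % 2 == 1]

# ...and the explicit loop it is shorthand for.
expanded = []
for x in nums:
    if x % 2 == 1:               # filter: keep odd numbers
        expanded.append(x**2)    # transform: square them

# sum() also accepts a generator expression, which skips the temporary list.
total = sum(x**2 for x in nums if x % 2 == 1)

print(squares_of_odds, expanded, total)
```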
Question 2: Code Writing (30 points)
Choose 3 out of 5 questions. Each worth 10 points.
Q2.1 - Count Vowels (10 points)
Write a function count_vowels(text) that:
- Counts vowels (a, e, i, o, u) - case insensitive
- Returns the count as an integer
def count_vowels(text):
    """
    Count the number of vowels in a string.
    Vowels are: a, e, i, o, u (case insensitive)
    Args:
        text: Input string to analyze
    Returns:
        int: Number of vowels found
    Examples:
        >>> count_vowels("Hello World")
        3
        >>> count_vowels("AEIOU")
        5
    """
    # Define vowels (both cases for easy comparison)
    vowels = "aeiouAEIOU"
    # Initialize counter
    count = 0
    # Iterate through each character in the text
    for char in text:
        # Check if character is a vowel
        if char in vowels:
            count += 1
    return count
# Alternative: More Pythonic one-liner
def count_vowels_v2(text):
    """One-liner using generator expression and sum."""
    return sum(1 for char in text.lower() if char in 'aeiou')
# Alternative: Using count method
def count_vowels_v3(text):
    """Using str.count() for each vowel."""
    text_lower = text.lower()
    return sum(text_lower.count(v) for v in 'aeiou')
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [
        ("Hello World", 3), # e, o, o
        ("AEIOU", 5), # all uppercase vowels
        ("rhythm", 0), # no vowels
        ("", 0), # empty string
        ("AaEeIiOoUu", 10), # mixed case
    ]
    print("Testing count_vowels:")
    for text, expected in test_cases:
        result = count_vowels(text)
        status = "✓" if result == expected else "✗"
        print(f' {status} count_vowels("{text}") = {result} (expected {expected})')
Test Output:
Testing count_vowels:
 ✓ count_vowels("Hello World") = 3 (expected 3)
 ✓ count_vowels("AEIOU") = 5 (expected 5)
 ✓ count_vowels("rhythm") = 0 (expected 0)
 ✓ count_vowels("") = 0 (expected 0)
 ✓ count_vowels("AaEeIiOoUu") = 10 (expected 10)
Key points:
- Handle both uppercase and lowercase
- Use the in operator for membership test
- Simple counter pattern
Q2.2 - Word Frequency Dictionary (10 points)
Write a function word_frequency(words) that:
- Input: list of words
- Return: dictionary with word counts
- Example: ['a', 'b', 'a'] → {'a': 2, 'b': 1}
def word_frequency(words):
    """
    Count frequency of each word in a list.
    Args:
        words: List of words (strings)
    Returns:
        dict: Dictionary mapping each word to its count
    Examples:
        >>> word_frequency(['a', 'b', 'a'])
        {'a': 2, 'b': 1}
    """
    # Initialize empty frequency dictionary
    freq = {}
    # Count each word
    for word in words:
        if word in freq:
            # Word seen before - increment count
            freq[word] += 1
        else:
            # First occurrence - initialize count to 1
            freq[word] = 1
    return freq
# Alternative: Using dict.get()
def word_frequency_v2(words):
    """Using get() method to simplify logic."""
    freq = {}
    for word in words:
        # get(key, default) returns default if key doesn't exist
        freq[word] = freq.get(word, 0) + 1
    return freq
# Alternative: Using collections.Counter
def word_frequency_v3(words):
    """Using Counter from collections (most Pythonic)."""
    from collections import Counter
    return dict(Counter(words))
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [
        ['a', 'b', 'a'],
        ['hello', 'world', 'hello', 'hello'],
        [],
        ['single'],
    ]
    print("Testing word_frequency:")
    for words in test_cases:
        result = word_frequency(words)
        print(f" {words} → {result}")
Test Output:
Testing word_frequency:
 ['a', 'b', 'a'] → {'a': 2, 'b': 1}
 ['hello', 'world', 'hello', 'hello'] → {'hello': 3, 'world': 1}
 [] → {}
 ['single'] → {'single': 1}
Key techniques:
- Check if key exists before incrementing
- Alternative: use dict.get(key, default)
- Best practice: use collections.Counter
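Counter offers more than counting, which is part of why it is usually recommended. The sketch below shows two of its conveniences next to defaultdict(int), another standard-library way to start counts at zero. The sample words are taken from the test cases.

```python
from collections import Counter, defaultdict

words = ['hello', 'world', 'hello', 'hello']

counts = Counter(words)
top = counts.most_common(1)     # list of (word, count) pairs, highest first
missing = counts['missing']     # absent keys read as 0 instead of raising KeyError

# defaultdict(int) gives the same "start at 0" behaviour in a plain loop.
freq = defaultdict(int)
for word in words:
    freq[word] += 1

print(top, missing, dict(freq))
```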
Q2.3 - Multiplication Table (10 points)
Write a function multiplication_table(n) that:
- Prints an n×n multiplication table
- Format: 1 x 1 = 1
def multiplication_table(n):
    """
    Print an n×n multiplication table.
    Args:
        n: Size of the table (positive integer)
    Example output for n=3:
        1 x 1 = 1
        1 x 2 = 2
        1 x 3 = 3
        2 x 1 = 2
        ...
        3 x 3 = 9
    """
    # Validate input
    if n <= 0:
        print("Please provide a positive integer.")
        return
    # Outer loop: rows (first multiplier)
    for i in range(1, n + 1):
        # Inner loop: columns (second multiplier)
        for j in range(1, n + 1):
            # Calculate product
            product = i * j
            # Print formatted result using f-string
            print(f"{i} x {j} = {product}")
# Alternative: Compact table format
def multiplication_table_compact(n):
    """Print table in grid format."""
    for i in range(1, n + 1):
        row = ""
        for j in range(1, n + 1):
            row += f"{i*j:4}" # 4-character width for alignment
        print(row)
# ===== Test Cases =====
if __name__ == "__main__":
    print("3x3 Multiplication Table:")
    print("-" * 20)
    multiplication_table(3)
    print("\n3x3 Compact Format:")
    print("-" * 20)
    multiplication_table_compact(3)
Test Output:
3x3 Multiplication Table:
--------------------
1 x 1 = 1
1 x 2 = 2
1 x 3 = 3
2 x 1 = 2
2 x 2 = 4
2 x 3 = 6
3 x 1 = 3
3 x 2 = 6
3 x 3 = 9
3x3 Compact Format:
--------------------
   1   2   3
   2   4   6
   3   6   9
Key concepts:
- Nested loops for 2D iteration
- range(1, n+1) to start from 1
- f-strings for formatted output
Q2.4 - Find Max and Min (10 points)
Write a function find_max_min(numbers) that:
- Input: list of numbers
- Return: tuple (maximum, minimum, difference)
- Handle empty list by returning (None, None, None)
def find_max_min(numbers):
    """
    Find maximum, minimum, and their difference in a list.
    Args:
        numbers: List of numeric values
    Returns:
        tuple: (maximum, minimum, difference) or (None, None, None) if empty
    Examples:
        >>> find_max_min([5, 2, 8, 1, 9])
        (9, 1, 8)
        >>> find_max_min([])
        (None, None, None)
    """
    # Handle empty list edge case
    # IMPORTANT: Check this first to avoid errors with min()/max()
    if not numbers: # Empty list is falsy in Python
        return (None, None, None)
    # Find maximum and minimum using built-in functions
    maximum = max(numbers)
    minimum = min(numbers)
    # Calculate difference (range of values)
    difference = maximum - minimum
    return (maximum, minimum, difference)
# Alternative: Without using built-in min/max
def find_max_min_manual(numbers):
    """Manual implementation without min()/max()."""
    if not numbers:
        return (None, None, None)
    # Initialize with first element
    maximum = numbers[0]
    minimum = numbers[0]
    # Iterate through remaining elements
    for num in numbers[1:]:
        if num > maximum:
            maximum = num
        if num < minimum:
            minimum = num
    return (maximum, minimum, maximum - minimum)
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [
        [5, 2, 8, 1, 9], # Normal case
        [3], # Single element
        [], # Empty list
        [-5, -2, -8, -1], # Negative numbers
        [1, 1, 1, 1], # All same
    ]
    print("Testing find_max_min:")
    for nums in test_cases:
        result = find_max_min(nums)
        print(f" {nums} → max={result[0]}, min={result[1]}, diff={result[2]}")
Test Output:
Testing find_max_min:
 [5, 2, 8, 1, 9] → max=9, min=1, diff=8
 [3] → max=3, min=3, diff=0
 [] → max=None, min=None, diff=None
 [-5, -2, -8, -1] → max=-1, min=-8, diff=7
 [1, 1, 1, 1] → max=1, min=1, diff=0
Key points:
- ALWAYS handle empty list first
- Use built-in min() and max() for efficiency
- Return a tuple, not a list
Q2.5 - Factorial (Recursive) (10 points)
Write a function factorial(n) that:
- Calculates n! recursively
- Handle: 0! = 1, negative returns None
def factorial(n):
"""
Calculate factorial of n using recursion.
Factorial definition:
- n! = n ร (n-1) ร (n-2) ร ... ร 2 ร 1
- 0! = 1 (by definition)
- Negative numbers: undefined (return None)
Args:
n: Non-negative integer
Returns:
int: n! or None for negative input
Examples:
>>> factorial(5)
120
>>> factorial(0)
1
"""
# Handle negative input
if n < 0:
return None
# Base case: 0! = 1 and 1! = 1
if n == 0 or n == 1:
return 1
# Recursive case: n! = n ร (n-1)!
return n * factorial(n - 1)
# Trace for factorial(4):
# factorial(4) = 4 × factorial(3)
#              = 4 × (3 × factorial(2))
#              = 4 × (3 × (2 × factorial(1)))
#              = 4 × (3 × (2 × 1))
#              = 4 × (3 × 2)
#              = 4 × 6
#              = 24
# Alternative: Iterative version (no recursion)
def factorial_iterative(n):
    """Calculate factorial using iteration."""
    if n < 0:
        return None
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [
        (0, 1),       # 0! = 1
        (1, 1),       # 1! = 1
        (5, 120),     # 5! = 120
        (10, 3628800),
        (-5, None),   # Negative
    ]
    print("Testing factorial:")
    for n, expected in test_cases:
        result = factorial(n)
        status = "✓" if result == expected else "✗"
        print(f" {status} factorial({n}) = {result} (expected {expected})")
Test Output:
Testing factorial:
 ✓ factorial(0) = 1 (expected 1)
 ✓ factorial(1) = 1 (expected 1)
 ✓ factorial(5) = 120 (expected 120)
 ✓ factorial(10) = 3628800 (expected 3628800)
 ✓ factorial(-5) = None (expected None)
Recursion components:
- Base case: stops recursion (n=0 or n=1)
- Recursive case: breaks problem into smaller subproblem
- Progress: n decreases each call, eventually reaching base case
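The "Progress" point has a practical limit: each recursive call uses a stack frame, and CPython stops at a configurable recursion depth. The sketch below restates both functions compactly so it runs on its own, checks them against the standard library's `math.factorial`, and then tries an input chosen to exceed the current limit. The variable `deep_n` is an illustrative value, not part of the exam answer.

```python
import math
import sys

def factorial(n):
    """Recursive factorial; returns None for negative input."""
    if n < 0:
        return None
    if n <= 1:
        return 1
    return n * factorial(n - 1)

def factorial_iterative(n):
    """Iterative factorial; returns None for negative input."""
    if n < 0:
        return None
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

# Both versions agree with the standard library for small inputs
for n in range(0, 20):
    assert factorial(n) == factorial_iterative(n) == math.factorial(n)

# Pick an input deeper than the interpreter's current recursion limit
deep_n = sys.getrecursionlimit() + 1000
try:
    factorial(deep_n)
    recursion_failed = False
except RecursionError:
    recursion_failed = True

# The iterative version is not bound by call depth
iterative_ok = factorial_iterative(deep_n) == math.factorial(deep_n)
print(recursion_failed, iterative_ok)
```

For exam-sized inputs the recursive version is fine; the iterative one is the safer choice when n can be large.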
Question 3: Pandas & SVM/Random Forest (25 points)
Part A: Data Preprocessing (15 points)
Given this DataFrame:
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler
data = {
    'Age': [25, 30, None, 35, 40],
    'Income': [30000, 50000, 45000, None, 60000],
    'Education': ['High School', 'Bachelor', 'Master', 'PhD', 'Bachelor'],
    'Purchased': ['No', 'Yes', 'Yes', 'No', 'Yes']
}
df = pd.DataFrame(data)
Q3.A1 (5 points) Fill missing Age with median, missing Income with mean.
💡 Click to View Verified Answer
import pandas as pd
# Create the DataFrame
data = {
    'Age': [25, 30, None, 35, 40],
    'Income': [30000, 50000, 45000, None, 60000],
    'Education': ['High School', 'Bachelor', 'Master', 'PhD', 'Bachelor'],
    'Purchased': ['No', 'Yes', 'Yes', 'No', 'Yes']
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
print()
# Calculate statistics before filling (pandas skips NaN by default)
age_median = df['Age'].median()    # Median of [25, 30, 35, 40] = 32.5
income_mean = df['Income'].mean()  # Mean of [30000, 50000, 45000, 60000] = 46250
print(f"Age median (excluding NaN): {age_median}")
print(f"Income mean (excluding NaN): {income_mean}")
print()
# Fill missing values
# Method 1: Assign the result of fillna back to the column
df['Age'] = df['Age'].fillna(age_median)
df['Income'] = df['Income'].fillna(income_mean)
# Method 2: Fill both columns in one call with a dict (alternative)
# df = df.fillna({'Age': age_median, 'Income': income_mean})
# Avoid df['Age'].fillna(age_median, inplace=True): recent pandas versions
# warn about this chained pattern, and with Copy-on-Write it leaves df unchanged.
print("After filling missing values:")
print(df)
Calculations:
- Age values (excluding NaN): [25, 30, 35, 40]
- Age median: (30 + 35) / 2 = 32.5
- Income values (excluding NaN): [30000, 50000, 45000, 60000]
- Income mean: (30000 + 50000 + 45000 + 60000) / 4 = 46250
Result:
- Row 2: Age filled with 32.5
- Row 3: Income filled with 46250.0
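To check the arithmetic by hand without pandas, the same fill can be sketched with the standard library. The helper `fill_missing` is a hypothetical name introduced here for illustration; `None` stands in for NaN.

```python
from statistics import mean, median

ages = [25, 30, None, 35, 40]
incomes = [30000, 50000, 45000, None, 60000]

def fill_missing(values, fill_value):
    """Return a copy of values with every None replaced by fill_value."""
    return [fill_value if v is None else v for v in values]

# Statistics are computed on the observed (non-missing) values only
age_median = median(v for v in ages if v is not None)
income_mean = mean(v for v in incomes if v is not None)

filled_ages = fill_missing(ages, age_median)
filled_incomes = fill_missing(incomes, income_mean)

print(age_median, income_mean)
print(filled_ages)
print(filled_incomes)
```

The key step is the same as in pandas: compute the statistic from the non-missing values first, then substitute it.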
Q3.A2 (5 points) Encode 'Education' using LabelEncoder. Show the mapping.
💡 Click to View Verified Answer
from sklearn.preprocessing import LabelEncoder
# Create LabelEncoder instance
le = LabelEncoder()
# Fit and transform the Education column
df['Education_Encoded'] = le.fit_transform(df['Education'])
print("Encoding result:")
print(df[['Education', 'Education_Encoded']])
print()
# Show the mapping (classes are sorted alphabetically)
print("LabelEncoder mapping:")
for i, label in enumerate(le.classes_):
    print(f" '{label}' → {i}")
LabelEncoder sorts alphabetically then assigns 0, 1, 2, ...:
| Original | Encoded |
|---|---|
| Bachelor | 0 |
| High School | 1 |
| Master | 2 |
| PhD | 3 |
Encoded column: [1, 0, 2, 3, 0]
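Important: LabelEncoder assigns integers based on the sorted (alphabetical) order of the unique values, not order of appearance! The codes therefore do not reflect education level: Bachelor (0) is numbered below High School (1).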
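The mapping above can be reproduced without scikit-learn, which makes the sorting rule easy to see. This is a minimal sketch of the behaviour, not the library's implementation; the `by_level` dictionary is a hypothetical hand-written mapping added only to contrast alphabetical codes with an order that follows education level.

```python
education = ['High School', 'Bachelor', 'Master', 'PhD', 'Bachelor']

# Sorted unique labels play the role of LabelEncoder's classes_
classes = sorted(set(education))
alphabetical = {label: code for code, label in enumerate(classes)}
encoded = [alphabetical[label] for label in education]

# Hypothetical hand-written mapping that follows education level instead
by_level = {'High School': 0, 'Bachelor': 1, 'Master': 2, 'PhD': 3}
encoded_by_level = [by_level[label] for label in education]

print(alphabetical)
print(encoded)
print(encoded_by_level)
```

If a model is expected to use the ordering of the codes, an explicit mapping like the second one is usually the more meaningful choice.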
Q3.A3 (5 points) When should you use StandardScaler vs MinMaxScaler?
💡 Click to View Answer
| Scaler | Formula | Output | Best For |
|---|---|---|---|
| StandardScaler | (x - mean) / std | Mean=0, Std=1 (unbounded) | SVM, Logistic Regression, data with moderate outliers |
| MinMaxScaler | (x - min) / (max - min) | [0, 1] | Neural Networks, KNN, image data |
Use StandardScaler when:
- Data is approximately normally distributed
- The data contains some outliers (they still shift the mean and std, but they are not squeezed against a fixed bound and the other values keep more of their spread than under min-max scaling)
- Using algorithms like SVM, Linear Regression
Use MinMaxScaler when:
- You need bounded output (0 to 1)
- Working with neural networks or image data
- The data has no extreme outliers (a single extreme value sets the min or max and compresses every other value into a narrow range)
Quick rule:
- SVM, Linear models → StandardScaler
- Neural networks, KNN → MinMaxScaler
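The two formulas in the table can be compared directly on a small example. The sketch below implements them with the standard library (using the population standard deviation, which is what StandardScaler uses) and applies both to a hypothetical income list with one extreme value. The data and function names are made up for illustration.

```python
from statistics import mean, pstdev

def standard_scale(values):
    """Z-score scaling: (x - mean) / std, using the population std."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(x - mu) / sigma for x in values]

def min_max_scale(values):
    """Min-max scaling: (x - min) / (max - min), giving values in [0, 1]."""
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]

# Hypothetical incomes with one extreme outlier at the end
incomes = [30000, 45000, 50000, 60000, 1000000]

z_scores = standard_scale(incomes)
min_max = min_max_scale(incomes)

print([round(z, 2) for z in z_scores])
print([round(m, 3) for m in min_max])
```

With min-max scaling the outlier becomes 1.0 and the four ordinary incomes are all pushed below roughly 0.03, which shows why extreme values are a concern for MinMaxScaler. The z-scores are affected by the outlier too, since it inflates both the mean and the standard deviation.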
Part B: SVM & Random Forest Theory (10 points)
Q3.B1 (5 points) Explain the "kernel trick" in SVM.
💡 Click to View Answer
Kernel Trick Explanation:
Problem: Some data is not linearly separable in its original space.
Solution: Map the data into a higher-dimensional space where it becomes linearly separable. The kernel trick makes this affordable: a kernel function returns the inner product of two points in that space directly from their original features.
How it works:
- Original 2D data might have circular boundaries (can't draw a straight line)
- Transform to 3D using a feature map (for example, add x1^2 + x2^2 as a third coordinate)
- In 3D, a flat plane can now separate the classes
- The "trick": SVM training and prediction only need inner products between points, so the kernel computes them efficiently without ever computing the transformation explicitly
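The last step can be verified numerically. The sketch below uses a simplified degree-2 polynomial kernel, K(u, v) = (u · v)^2, which is an assumption made for illustration and not scikit-learn's exact parameterization. It shows that the kernel computed in 2D equals the inner product after an explicit mapping `phi` into 3D.

```python
import math

def phi(point):
    """Explicit degree-2 feature map for a 2D point: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = point
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def poly_kernel(u, v):
    """Homogeneous polynomial kernel of degree 2: (u . v)^2."""
    return dot(u, v) ** 2

# Two arbitrary 2D points chosen for illustration
a = (1.0, 2.0)
b = (3.0, -1.0)

explicit = dot(phi(a), phi(b))   # inner product computed in the 3D feature space
implicit = poly_kernel(a, b)     # same value computed from the 2D inputs

print(explicit, implicit)
```

Both numbers agree up to floating-point rounding, yet `poly_kernel` never builds the 3D vectors. For kernels such as RBF the implied feature space is infinite-dimensional, so the explicit route is not even available.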
Common kernels:
| Kernel | Use Case |
|---|---|
| Linear | Already linearly separable |
| RBF (Radial Basis Function) | Default choice, works well for most cases |
| Polynomial | Data with polynomial relationships |
Example in code:
from sklearn.svm import SVC
# Linear kernel
model_linear = SVC(kernel='linear')
# RBF kernel (default)
model_rbf = SVC(kernel='rbf')
# Polynomial kernel
model_poly = SVC(kernel='poly', degree=3)
Q3.B2 (5 points) What is "bagging" in Random Forest? Why does it help?
💡 Click to View Answer
Bagging (Bootstrap Aggregating):
Process:
- Draw multiple bootstrap samples from the training data (random sampling with replacement)
- Train a separate decision tree on each sample
- Combine predictions:
  - Classification: majority voting
  - Regression: average
Random Forest adds one more source of randomness on top of bagging: each split considers only a random subset of the features, which makes the trees less similar to each other.
Why it helps:
- Reduces Overfitting
  - Each tree sees different data
  - Individual tree errors tend to cancel out
  - Ensemble is more robust
- Reduces Variance
  - Averaging many predictions is more stable
  - Less sensitive to noise in training data
- Handles Outliers Better
  - Outliers only affect some trees, not all
  - Their influence is diluted in the ensemble
- Better Generalization
  - Collective wisdom outperforms single tree
  - Works well on unseen data
Analogy: Like asking 100 doctors for a diagnosis instead of 1: the collective opinion is usually more reliable.
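The three process steps can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not how scikit-learn implements Random Forest: `train_stump` is a hypothetical stand-in for a decision tree that simply predicts the majority label of its own bootstrap sample, and the training data is made up.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items from data uniformly at random, with replacement."""
    return [rng.choice(data) for _ in data]

def majority_vote(predictions):
    """Return the most common prediction (the classification aggregate)."""
    return Counter(predictions).most_common(1)[0][0]

def train_stump(sample):
    """Hypothetical stand-in for a decision tree: always predict the
    majority label of its own bootstrap sample."""
    majority = majority_vote([label for _, label in sample])
    return lambda features: majority

rng = random.Random(42)  # fixed seed so the run is repeatable

# Toy training set of (feature, label) pairs: 8 'Yes' and 2 'No'
training = [(i, 'Yes') for i in range(8)] + [(i, 'No') for i in range(8, 10)]

# Steps 1 and 2: one bootstrap sample and one model per ensemble member
models = [train_stump(bootstrap_sample(training, rng)) for _ in range(25)]

# Step 3: aggregate the individual predictions by majority vote
votes = [model(None) for model in models]
ensemble_prediction = majority_vote(votes)

print(Counter(votes))
print(ensemble_prediction)
```

Individual models can disagree because each one sees a different resample, but the vote across all 25 is far more stable than any single model.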
Question 4: Naive Bayes & Decision Tree (25 points)
Part A: Naive Bayes Calculation (15 points)
Dataset: Email classification
| Email | Contains "Free" | Contains "Winner" | Spam? |
|---|---|---|---|
| 1 | Yes | Yes | Spam |
| 2 | Yes | No | Spam |
| 3 | No | Yes | Spam |
| 4 | No | No | Not Spam |
| 5 | Yes | No | Not Spam |
| 6 | No | No | Not Spam |
Q4.A1 (10 points) A new email contains "Free" but not "Winner". Calculate P(Spam|Free=Yes, Winner=No).
💡 Click to View Verified Answer
Naive Bayes Formula: $P(Class|Features) \propto P(Class) \times \prod P(Feature|Class)$
Step 1: Calculate Prior Probabilities
| Class | Count | P(Class) |
|---|---|---|
| Spam | 3 (emails 1,2,3) | 3/6 = 0.5 |
| Not Spam | 3 (emails 4,5,6) | 3/6 = 0.5 |
Step 2: Calculate Likelihoods
For Spam emails (1, 2, 3):
- P(Free=Yes | Spam) = 2/3 (emails 1, 2 have Free)
- P(Winner=No | Spam) = 1/3 (only email 2 has Winner=No)
For Not Spam emails (4, 5, 6):
- P(Free=Yes | Not Spam) = 1/3 (only email 5)
- P(Winner=No | Not Spam) = 3/3 = 1 (all three)
Step 3: Calculate Unnormalized Posteriors
$P(Spam|evidence) \propto P(Spam) \times P(Free=Yes|Spam) \times P(Winner=No|Spam)$ $= 0.5 \times \frac{2}{3} \times \frac{1}{3} = 0.5 \times 0.667 \times 0.333 = 0.111$
$P(NotSpam|evidence) \propto 0.5 \times \frac{1}{3} \times 1 = 0.167$
Step 4: Normalize
$P(Spam|evidence) = \frac{0.111}{0.111 + 0.167} = \frac{0.111}{0.278} = 0.40$
Answer: P(Spam | Free=Yes, Winner=No) = 0.40 = 40%
Prediction: NOT SPAM (probability < 50%)
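The four steps can be double-checked with exact fractions, which avoids the rounding in 0.111 and 0.167. This is a minimal sketch built on the six-email table; the function name `unnormalized_posterior` is introduced here for illustration.

```python
from fractions import Fraction

# (contains_free, contains_winner, label) for emails 1 to 6
emails = [
    ('Yes', 'Yes', 'Spam'),
    ('Yes', 'No', 'Spam'),
    ('No', 'Yes', 'Spam'),
    ('No', 'No', 'Not Spam'),
    ('Yes', 'No', 'Not Spam'),
    ('No', 'No', 'Not Spam'),
]

def unnormalized_posterior(label, free, winner):
    """P(label) * P(Free=free | label) * P(Winner=winner | label)."""
    rows = [e for e in emails if e[2] == label]
    prior = Fraction(len(rows), len(emails))
    p_free = Fraction(sum(1 for e in rows if e[0] == free), len(rows))
    p_winner = Fraction(sum(1 for e in rows if e[1] == winner), len(rows))
    return prior * p_free * p_winner

spam_score = unnormalized_posterior('Spam', 'Yes', 'No')
not_spam_score = unnormalized_posterior('Not Spam', 'Yes', 'No')

# Normalize so the two posteriors sum to 1
p_spam = spam_score / (spam_score + not_spam_score)

print(spam_score, not_spam_score, p_spam)
```

The exact values are 1/9 and 1/6 for the unnormalized scores and 2/5 for the posterior, matching the 0.40 above.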
Q4.A2 (5 points) What is the "naive" assumption in Naive Bayes? When might it fail?
💡 Click to View Answer
The "Naive" Assumption:
- Features are conditionally independent given the class
- P(A, B | Class) = P(A | Class) × P(B | Class)
- Each feature contributes independently to the prediction
When it fails:
- Correlated features
  - Example: "Free" and "Prize" often appear together in spam
  - Treating them as independent overcounts their combined effect
- Redundant features
  - Example: Having both "temperature in °C" and "temperature in °F"
  - These are perfectly correlated, violating independence
- Feature interactions matter
  - Example: Medical diagnosis where symptom combinations are important
  - Symptom A alone is harmless, but A+B together indicates disease
Despite this limitation: Naive Bayes often works surprisingly well in practice, especially for:
- Text classification
- Spam detection
- Sentiment analysis
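The overcounting effect of redundant features can be made concrete with the numbers from Q4.A1. The sketch below assumes a hypothetical second column that is an exact copy of "Contains Free" and treats it as independent evidence, as Naive Bayes would. The function `p_spam` and its `free_copies` parameter are illustrative names.

```python
from fractions import Fraction

prior = Fraction(1, 2)  # P(Spam) = P(Not Spam) = 3/6
p_free = {'Spam': Fraction(2, 3), 'Not Spam': Fraction(1, 3)}    # P(Free=Yes | class)
p_no_winner = {'Spam': Fraction(1, 3), 'Not Spam': Fraction(1)}  # P(Winner=No | class)

def p_spam(free_copies):
    """Posterior P(Spam | Free=Yes, Winner=No) when the 'Free' feature
    is counted free_copies times as if each copy were independent."""
    score = {c: prior * p_free[c] ** free_copies * p_no_winner[c] for c in p_free}
    return score['Spam'] / (score['Spam'] + score['Not Spam'])

original = p_spam(1)     # 'Free' counted once, as in Q4.A1
duplicated = p_spam(2)   # 'Free' counted twice (hypothetical redundant column)

print(original, duplicated)
```

Counting the same evidence twice moves the posterior from 2/5 to 4/7, which flips the prediction from Not Spam to Spam even though no new information was added.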
Part B: Information Gain (10 points)
Q4.B1 (10 points) Calculate Information Gain for the "Contains Free" feature.
Original dataset: 3 Spam, 3 Not Spam
💡 Click to View Verified Answer
Entropy Formula: $H(S) = -\sum p_i \log_2(p_i)$
Step 1: Parent Entropy (3 Spam, 3 Not Spam)
$H(parent) = -0.5 \log_2(0.5) - 0.5 \log_2(0.5)$ $= -0.5 \times (-1) - 0.5 \times (-1)$ $= 0.5 + 0.5 = 1.0$
(Maximum entropy for binary classification = 1.0)
Step 2: Split by "Contains Free"
Free=Yes (3 emails: 2 Spam, 1 Not Spam): $H = -\frac{2}{3} \log_2(\frac{2}{3}) - \frac{1}{3} \log_2(\frac{1}{3})$ $= -0.667 \times (-0.585) - 0.333 \times (-1.585)$ $= 0.390 + 0.528 = 0.918$
Free=No (3 emails: 1 Spam, 2 Not Spam): $H = -\frac{1}{3} \log_2(\frac{1}{3}) - \frac{2}{3} \log_2(\frac{2}{3})$ $= 0.528 + 0.390 = 0.918$
Step 3: Weighted Average Entropy $H(children) = \frac{3}{6} \times 0.918 + \frac{3}{6} \times 0.918 = 0.918$
Step 4: Information Gain $IG = H(parent) - H(children) = 1.0 - 0.918 = 0.082$
Answer: Information Gain = 0.082 bits
Interpretation: "Contains Free" provides a small amount of information for classification. Higher IG would indicate a better split.
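The calculation can be reproduced in code. The sketch below implements the entropy and information gain formulas and, for comparison, also evaluates a split on "Contains Winner" using counts read from the same email table (Winner=Yes: 2 Spam, 0 Not Spam; Winner=No: 1 Spam, 3 Not Spam). The function names are illustrative.

```python
from math import log2

def entropy(counts):
    """Shannon entropy in bits of a class-count list such as [3, 3]."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    total = sum(parent)
    weighted = sum(sum(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

# Counts are [Spam, Not Spam]
parent = [3, 3]
free_split = [[2, 1], [1, 2]]     # Free=Yes, Free=No
winner_split = [[2, 0], [1, 3]]   # Winner=Yes, Winner=No

ig_free = information_gain(parent, free_split)
ig_winner = information_gain(parent, winner_split)

print(round(ig_free, 3), round(ig_winner, 3))
```

On this dataset "Contains Winner" yields a gain of about 0.459 bits against 0.082 for "Contains Free", so a decision tree using information gain would split on "Winner" first.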
End of Exam
| Question | Topic | Points |
|---|---|---|
| Q1 | Python Output Analysis | 20 |
| Q2 | Code Writing (choose 3/5) | 30 |
| Q3 | Pandas & SVM/Random Forest | 25 |
| Q4 | Naive Bayes & Decision Tree | 25 |
| Total | | 100 |
Key Formulas Reference
| Concept | Formula |
|---|---|
| Gini Index | 1 - Σ(pᵢ²) |
| Entropy | -Σ pᵢ log₂(pᵢ) |
| Info Gain | H(parent) - Σ weighted H(children) |
| Bayes | P(A\|B) ∝ P(B\|A) × P(A) |
| Z-score | (x - μ) / σ |
| MinMax | (x - min) / (max - min) |
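The Gini index is the only formula in this table not worked through above, so here is a brief sketch applying it to class counts from the Question 4 email dataset. The function name `gini` and the chosen nodes are for illustration only.

```python
def gini(counts):
    """Gini index 1 - sum(p_i^2) for a class-count list such as [3, 3]."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

# Counts are [Spam, Not Spam] from the email dataset in Question 4
parent_gini = gini([3, 3])     # evenly mixed node
free_yes_gini = gini([2, 1])   # Free=Yes branch
pure_gini = gini([2, 0])       # a pure node (only one class present)

print(parent_gini, round(free_yes_gini, 3), pure_gini)
```

Like entropy, the Gini index is highest for an evenly mixed node (0.5 for two classes) and 0 for a pure node.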
All code verified and tested. Show your work for partial credit. Good luck!