TP Test Practice


Test set

Final_exam_formatted_1

Programming · Final_exam_formatted_1.docx

#1

Final Exam Sample Questions

— L1 —

What is the output of a classification algorithm?

  1. A continuous value
  2. Groups of similar instances
  3. A categorical value
  4. Majority class dominates predictions
  5. k-NN becomes sensitive to noisy data
  6. Not affected at all. k-NN is robust to any value of k
  7. Small values of k produce the best results
#2

In K-NN:

image1.png
  1. A higher value of the K parameter => higher confidence
  2. A higher value of the K parameter => lower confidence
  3. Higher distance to k nearest instances => higher confidence
  4. Higher distance to k nearest instances => lower confidence
#3

You apply k-NN to a dataset with features: age (years) and salary (USD). Salary ranges from 1000 to 100000, age from 18 to 60. What will most likely happen without preprocessing?

  1. Model will ignore salary completely
  2. Model will give equal importance to both features
  3. Model will fail to run
  4. Model will automatically normalize features
  5. Salary will dominate distance calculations and bias predictions
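
A minimal sketch of why unscaled salary swamps age in a Euclidean distance. The three points and the helper names `euclidean` and `min_max` are made up for illustration; only the feature ranges come from the question.

```python
import math

def euclidean(a, b):
    """Plain Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def min_max(point, lo=(18, 1_000), hi=(60, 100_000)):
    """Rescale (age, salary) to [0, 1] using the ranges stated in the question."""
    return tuple((v - l) / (h - l) for v, l, h in zip(point, lo, hi))

# Hypothetical points: (age in years, salary in USD)
query = (30, 50_000)
much_older = (60, 50_500)   # 30 years older, salary differs by 500
same_age = (30, 52_000)     # same age, salary differs by 2000

# Raw features: the salary gap decides which point is "nearer".
raw_older = euclidean(query, much_older)
raw_same_age = euclidean(query, same_age)

# Scaled features: the 30-year age gap now counts as well.
scaled_older = euclidean(min_max(query), min_max(much_older))
scaled_same_age = euclidean(min_max(query), min_max(same_age))

print("raw:   ", raw_older, raw_same_age)
print("scaled:", scaled_older, scaled_same_age)
```

On the raw values the much older person looks like the closer neighbour; after scaling, the person of the same age does.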
#4

In the code below, what is missing before training k-NN?

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    knn.fit(X_train, y_train)

  1. Model deployment
  2. Confusion matrix
  3. Label shuffling
  4. Removing all numeric columns
  5. Feature scaling such as StandardScaler
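
One hedged way to add the scaling step is a scikit-learn pipeline, so that the scaler is fitted on the training split only. The wine dataset, `n_neighbors=5` and the `random_state` values are assumptions chosen for this sketch, not part of the question.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Example data (assumption): 13 numeric features on very different scales.
X, y = load_wine(return_X_y=True)

# Split first, so no statistic of the test rows reaches the scaler.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# fit() learns the mean and variance from X_train only, then trains k-NN
# on the scaled training data; score() reuses those statistics on X_test.
knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
knn.fit(X_train, y_train)
test_accuracy = knn.score(X_test, y_test)
print(test_accuracy)
```

Keeping the scaler inside the pipeline also means cross-validation refits it within each fold.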
#5

What is the main problem in this code?

    scaler = StandardScaler()
    X_scaled = scaler.fit_transform(X)
    X_train, X_test, y_train, y_test = train_test_split(X_scaled, y)

  1. The scaler should not be used with k-NN
  2. The labels should be scaled too
  3. The model should be trained before splitting
  4. The test size is missing
  5. Data leakage because scaling is done before train-test split

— L2 —

#6

You want to predict the amounts that customers will spend on paying for traffic in different months based on their previous consumption history. This task is:

  1. Anomaly detection task
  2. None of these
  3. Classification task
  4. Clustering problem
  5. Regression task
#7

Evaluate the metrics and decide which model to choose for the pilot implementation.

⚠ The original question references a metrics table that is not visible in this document, so the answer cannot be checked against the actual numbers. Random forest often gives the most balanced performance of these three models, but that is a general tendency, not a guarantee.

image2.png
  1. Random forest
  2. Decision tree
  3. Logistic regression
#8

Case: An investigator invents a 'new diet'. To test the effectiveness of the diet, the investigator collects the measurements weight, height and BMI (body mass index). The investigator's aim is to predict an individual's BMI based on this information. Define the explanatory variable.

⚠ Explanatory (independent) variables are the inputs used to predict BMI. Since BMI = weight / height², weight and height are both explanatory. Among the single-variable options, Weight (option 2) is treated as the primary explanatory variable.

  1. Height and BMI
  2. Weight
  3. Individual's gender
  4. Weight and BMI
  5. Height
#9

What is the main goal of regression?

  1. Predict categories
  2. Group similar samples
  3. Reduce dataset size
  4. Detect anomalies
  5. Predict numeric values
#10

Which of the following is a regression task?

  1. Classifying emails
  2. Recognizing digits
  3. Clustering users
  4. Detecting fraud types
  5. Predicting house prices
#11

What does this code return?

    scores = cross_val_score(model, X, y, cv=5, scoring='r2')

  1. One final trained model
  2. Predicted target values
  3. Model coefficients
  4. Feature names
  5. R² scores for each fold
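
A small sketch of what the call hands back. The diabetes dataset and `LinearRegression` are assumptions picked only to make the example runnable.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Example regression data (assumption).
X, y = load_diabetes(return_X_y=True)
model = LinearRegression()

# One R² value per fold; the estimator is cloned internally for each fit.
scores = cross_val_score(model, X, y, cv=5, scoring="r2")

print(scores)          # array with 5 entries
print(scores.mean())   # a common single-number summary
```

The `model` object passed in stays unfitted, which is why the call does not return a trained model or its coefficients.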
#12

You increase the training data size significantly. What is expected?

  1. Model becomes random
  2. Features vanish
  3. Error always increases
  4. No effect
  5. Better generalization
#13

You increase the number of folds from 5 to 20 in cross-validation. What changes?

  1. Training stops
  2. Model becomes simpler
  3. Features are removed
  4. Data size shrinks
  5. Computation cost increases

— L3 —

#14

What is the main purpose of regularization in regression?

  1. Increase dataset size
  2. Improve training speed
  3. Remove all features
  4. Guarantee zero error
  5. Reduce model overfitting
#15

Which formula represents LASSO regression loss?

  1. MSE + λ Σw²
  2. MSE − λ Σ|w|
  3. MSE × Σw
  4. MSE + Σw²
  5. MSE + λ Σ|w|
#16

What happens when λ is very large?

  1. Model becomes complex
  2. Error becomes zero
  3. Features increase
  4. Training speeds up
  5. Coefficients shrink strongly
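
A hedged illustration with scikit-learn's `Lasso`, where the penalty strength λ is called `alpha`. The synthetic data and the two `alpha` values are assumptions chosen to make the effect visible.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data (assumption): only the first two features matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

weak = Lasso(alpha=0.01).fit(X, y)    # light penalty
strong = Lasso(alpha=10.0).fit(X, y)  # very heavy penalty

print("alpha=0.01:", weak.coef_)
print("alpha=10.0:", strong.coef_)
```

With the light penalty the first two coefficients stay near their true values; with the heavy penalty every coefficient is driven to zero, so the model predicts a constant.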
#17

What does this code do?

    Lasso(alpha=0.1)

  1. Applies L2 penalty
  2. Performs clustering
  3. Scales features
  4. Splits dataset
  5. Applies L1 penalty

— L4 —

#18

What does accuracy measure?

  1. Only correct positives
  2. Only correct negatives
  3. Error magnitude
  4. Prediction probability
  5. Ratio of correct predictions to all predictions
#19

Given TP=50, TN=40, FP=10, FN=0, what is accuracy?

  1. 0.80
  2. 0.85
  3. 0.88
  4. 0.95
  5. 0.90
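
The arithmetic can be checked with a few lines. The function name `accuracy` is only an illustrative helper.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

# Counts from the question: (50 + 40) / (50 + 40 + 10 + 0)
print(accuracy(tp=50, tn=40, fp=10, fn=0))  # 0.9
```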
#20

Given precision=0.5 and recall=0.5, what is F1-score?

  1. 0.25
  2. 0.40
  3. 0.45
  4. 0.60
  5. 0.50
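
F1 is the harmonic mean of precision and recall, so equal inputs give that same value back. A minimal sketch, with `f1` as an illustrative helper name:

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(f1(0.5, 0.5))  # 0.5
```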
#21

What is the decision threshold in logistic regression?

  1. 0.0
  2. 1.0
  3. -1.0
  4. Depends on features
  5. 0.5 by default
#22

What happens when the threshold decreases?

  1. Fewer positives
  2. No effect
  3. Only negatives
  4. Model stops
  5. More positives predicted
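
A short sketch of the effect. The probabilities are invented, and `predict_labels` is a hypothetical helper that mimics thresholding predicted probabilities rather than any library call.

```python
def predict_labels(probabilities, threshold=0.5):
    """Label a sample positive (1) when its probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]

# Hypothetical predicted probabilities for the positive class.
probs = [0.10, 0.35, 0.45, 0.55, 0.80, 0.95]

at_default = predict_labels(probs)               # threshold 0.5
at_lower = predict_labels(probs, threshold=0.3)  # lowered threshold

print(sum(at_default), "positives at 0.5")
print(sum(at_lower), "positives at 0.3")
```

Lowering the threshold can only turn negatives into positives, which tends to raise recall at the cost of more false positives.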
#23

Given confusion matrix [[50,10],[5,35]], what is precision?

  1. 0.75
  2. 0.78
  3. 0.80
  4. 0.82
  5. 0.83
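
The result depends on how the matrix is laid out, which the question does not state. The sketch below computes precision under two common conventions; which one the course uses is an assumption the reader has to check.

```python
cm = [[50, 10],
      [5, 35]]

# Layout 1: scikit-learn's confusion_matrix with labels [0, 1].
# Rows are actual classes, columns are predicted classes: [[TN, FP], [FN, TP]].
tn, fp, fn, tp = cm[0][0], cm[0][1], cm[1][0], cm[1][1]
precision_sklearn_layout = tp / (tp + fp)        # 35 / 45

# Layout 2: positive class first, as in many textbooks: [[TP, FP], [FN, TN]].
tp2, fp2 = cm[0][0], cm[0][1]
precision_textbook_layout = tp2 / (tp2 + fp2)    # 50 / 60

print(round(precision_sklearn_layout, 2))
print(round(precision_textbook_layout, 2))
```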
#24

What is wrong in this multi-class code?

    model = LogisticRegression()
    model.fit(X_train, y_train)
    y_pred = model.predict_proba(X_test)[:,1]

  1. predict_proba cannot be used
  2. y_train must be binary
  3. [:,1] always selects the wrong dimension
  4. LogisticRegression cannot do multi-class
  5. Only one class probability is taken
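
A hedged sketch of one way to use all class probabilities. The iris dataset, `max_iter=1000` and `random_state=0` are assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Three-class example data (assumption).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# One column per class, each row summing to 1.
proba = model.predict_proba(X_test)

# Pick the most probable class per row instead of slicing out column 1.
y_pred = model.classes_[proba.argmax(axis=1)]
print(proba.shape, y_pred[:5])
```

Slicing `[:, 1]` keeps only the probability of the second class, which is a probability column rather than a prediction over all classes.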
#25

The logistic function is σ(x) = 1 / (1 + e^(-kx)), where x is the input. What is k?

⚠ In the logistic function, k (sometimes written as w or β) is the steepness (slope) coefficient that is optimized during training.

  1. The coefficient to optimize
  2. Sets of values of respective Xs and Ys
  3. Number of observations in data
  4. The number of independent variables/features
#26

If we're interested in predicting males, what is the specificity rate for the classification table below?

⚠ The original question references a classification table not visible in this document. Specificity = TN / (TN + FP). The answer 88.9% is the standard answer for this question in the course materials.

  1. 93.4%
  2. 83.4%
  3. 88.9%
  4. 89.9%

— L5 —

#27

What does hyperparameter tuning do?

  1. Optimizes parameters to improve the performance of a learning algorithm
  2. Expands the parameter set of a model to improve performance
  3. Takes parameter tuning so far that performance degrades
  4. Specifies the hyperplane that represents linear classifiers
#28

What does a decision tree split aim to achieve?

  1. Reduce dataset size
  2. Shuffle samples
  3. Scale features
  4. Add features
  5. Separate target values
#29

What is the formula for entropy?

  1. Σ(p²)
  2. Σ(p)
  3. Σ|p|
  4. Σ(p³)
  5. -Σ p log p
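
A minimal sketch of the formula. The logarithm base is not given in the question; base 2 (bits) is assumed here, and `entropy` is an illustrative helper name.

```python
import math

def entropy(probs):
    """Shannon entropy -sum(p * log2(p)); zero-probability terms contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # 1.0, a maximally mixed two-class node
print(entropy([0.9, 0.1]))   # lower, because the node is purer
```

A node containing a single class has entropy 0, which is what a decision tree split tries to move towards.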
#30

What happens if max_depth is not limited in entropy-based trees?

  1. Underfitting
  2. No splits
  3. Less variance
  4. Faster model
  5. Overfitting
#31

Which scoring would be better for imbalanced classification?

  1. scoring='r2'
  2. scoring='neg_mean_squared_error'
  3. scoring='explained_variance'
  4. scoring='max_error'
  5. scoring='f1'
#32

What happens in this code?

    tree = DecisionTreeClassifier(random_state=42)
    tree.fit(X_train, y_train)
    tree2 = DecisionTreeClassifier(random_state=42)
    tree2.fit(X_train, y_train)

  1. Different splits always
  2. Models cannot match
  3. Second model fails
  4. Random state ignored
  5. Reproducible same model
#33

What is the practical effect of increasing min_samples_leaf?

  1. More memorization
  2. More leaves appear
  3. Training accuracy always rises
  4. Tree becomes deeper
  5. Leaves become larger

— L6 —

#34

What does one-hot encoding do?

  1. Scales features
  2. Removes categories
  3. Sorts values
  4. Combines features
  5. Creates binary columns
#35

What is polynomial regression?

  1. Classification method
  2. Clustering method
  3. Scaling method
  4. Encoding method
  5. Nonlinear regression model
#36

What is the issue in this code?

    poly = PolynomialFeatures(3)
    X_poly = poly.fit_transform(X)
    X_train, X_test = train_test_split(X_poly)

  1. PolynomialFeatures cannot be used
  2. train_test_split needs y only
  3. X must be categorical
  4. Scaling is missing
  5. Transformation done before split

— L7 —

#37

Which is a simple method to handle missing values?

  1. Duplicate data
  2. Shuffle labels
  3. Remove target
  4. Scale features
  5. Fill with mean
#38

What is data leakage in imputation?

  1. Missing values increase
  2. Model slows down
  3. Features removed
  4. Noise added
  5. Using future/test data
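
A small pure-Python sketch of the difference. The numbers and the helper `fill` are invented for illustration.

```python
from statistics import mean

# Hypothetical feature column with missing entries marked as None.
train = [1.0, None, 3.0, 5.0]
test = [None, 100.0]

def fill(values, fill_value):
    """Replace None with the given fill value."""
    return [fill_value if v is None else v for v in values]

# Correct: the fill value is learned from the training rows only.
train_mean = mean(v for v in train if v is not None)
train_filled = fill(train, train_mean)
test_filled = fill(test, train_mean)

# Leaky: the fill value also sees the test rows.
leaky_mean = mean(v for v in train + test if v is not None)

print(train_mean, leaky_mean)
```

The leaky mean is pulled towards the test value of 100, so information from the test set ends up in the training data.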
#39

What happens if recall is low?

  1. More false positives
  2. More true negatives
  3. Better accuracy
  4. Faster training
  5. Missed positives

— L8 —

#40

What is unsupervised learning?

  1. Learning with labels
  2. Predicting targets
  3. Using regression
  4. Classification method
  5. Learning without labels
#41

What is inertia in KMeans?

  1. Model accuracy
  2. Cluster count
  3. Feature importance
  4. Training time
  5. Sum of squared distances
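
The definition can be checked against scikit-learn's `inertia_` attribute. The blob data and the `KMeans` settings below are assumptions for the sketch.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data (assumption).
X, _ = make_blobs(n_samples=200, centers=3, random_state=0)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Centre assigned to each sample, then the summed squared distance to it.
assigned_centers = km.cluster_centers_[km.labels_]
manual_inertia = ((X - assigned_centers) ** 2).sum()

print(manual_inertia, km.inertia_)
```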
#42

You choose K using the elbow method, but the curve is smooth with no clear elbow. What should you do?

  1. Always pick K=2
  2. Increase dataset size
  3. Remove features
  4. Stop clustering
  5. Use other metrics like silhouette
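
A hedged sketch of scoring several values of K with the silhouette coefficient. The blob data, the range of K and the `KMeans` settings are assumptions.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data with overlapping groups (assumption).
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=1.5, random_state=0)

scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    # Silhouette lies in [-1, 1]; higher means tighter, better separated clusters.
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(scores)
print("best K by silhouette:", best_k)
```

Unlike inertia, the silhouette score does not keep improving automatically as K grows, so it has a maximum that can be picked.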
#43

You cluster data and then add a new feature:

    kmeans.fit(X_old)
    labels_old = kmeans.labels_
    kmeans.fit(X_new)
    labels_new = kmeans.labels_

Labels change drastically. Why?

  1. KMeans lost data
  2. Labels are random
  3. Clusters cannot update
  4. Scaling missing
  5. Feature space changed distances
#44

You scale data and run KMeans. Then someone suggests removing scaling because 'units are meaningful'. What is the correct reasoning?

  1. Scaling always improves clustering
  2. KMeans ignores units
  3. Units do not matter
  4. Scaling removes patterns
  5. Scaling choice depends on distance meaning

— L9 —

#45

You run DBSCAN and all points are labeled as noise (-1). What is the most likely issue?

  1. eps is too large
  2. min_samples too small
  3. Too many features
  4. Scaling unnecessary
  5. eps is too small
#46

What happens if eps is extremely large?

  1. All noise
  2. No clusters
  3. Faster training
  4. Small clusters
  5. One big cluster
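
Both extremes of `eps` can be reproduced on toy data. The random points and the two `eps` values are assumptions chosen to be deliberately extreme.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# 100 random points in a 10 x 10 square (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 2))

# eps far smaller than any gap between points: no point has 5 neighbours.
tiny = DBSCAN(eps=1e-6, min_samples=5).fit(X)

# eps far larger than the whole data set: every point reaches every other.
huge = DBSCAN(eps=1e6, min_samples=5).fit(X)

print(set(tiny.labels_.tolist()))   # only the noise label
print(set(huge.labels_.tolist()))   # a single cluster label
```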
#47

What is wrong with this workflow?

    db = DBSCAN(eps=0.5, min_samples=5)
    db.fit(X_train)
    labels = db.fit_predict(X_test)

  1. DBSCAN cannot use test data
  2. fit_predict is invalid
  3. X_test must be labeled
  4. eps must be tuned
  5. Second fit overwrites model

— L10 —

#48

What does this code visualize? dendrogram(Z)

  1. Cluster centers
  2. Feature importance
  3. Data distribution
  4. Model accuracy
  5. Cluster hierarchy
#49

What happens if the distance threshold is very large?

  1. Many clusters
  2. No clusters
  3. Data split
  4. Faster model
  5. Few clusters
#50

You compare ward vs complete linkage. What differs most?

  1. Feature count
  2. Dataset size
  3. Labels
  4. Scaling
  5. Cluster shape

— L11 —

#51

What is the issue in this code?

    pca = PCA(n_components=2)
    X_pca = pca.fit_transform(X)
    X_train, X_test = train_test_split(X_pca)

  1. PCA cannot reduce dimensions
  2. train_test_split needs labels
  3. n_components must be 1
  4. X must be scaled after PCA
  5. PCA applied before split
#52

What does n_components=0.95 mean?

  1. Use 95 features
  2. Reduce to 95 samples
  3. Fix components to 95
  4. Remove 5% data
  5. Keep 95% variance
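
A short sketch of the fractional form of `n_components`. The digits dataset is an assumption used only to have something to reduce.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# 64 pixel features per sample (assumption: any numeric data set would do).
X, _ = load_digits(return_X_y=True)

# A float in (0, 1) asks for the smallest number of components whose
# cumulative explained variance reaches that fraction.
pca = PCA(n_components=0.95).fit(X)

print(pca.n_components_)                    # components actually kept
print(pca.explained_variance_ratio_.sum())  # at least 0.95
```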
#53

You use PCA for classification but accuracy drops. What is the likely reason?

  1. PCA increases features
  2. Scaling failed
  3. Too many samples
  4. Model incorrect
  5. Important info lost

— L12 —

#54

What is the key difference between LDA and PCA?

  1. Both use labels
  2. Both are unsupervised
  3. Both maximize variance
  4. Both reduce dimensions equally
  5. LDA uses class labels
#55

What happens if LDA is applied without labels?

  1. Works like PCA
  2. Clusters data
  3. Removes noise
  4. Scales features
  5. Cannot be applied
#56

What is a key difference between PCA and SVD?

  1. PCA uses labels
  2. SVD uses labels
  3. PCA cannot reduce data
  4. SVD cannot scale data
  5. PCA uses covariance

— L13 —

#57

What is the main constraint in NMF?

  1. Matrix must be square
  2. Values must be integers
  3. Data must be normalized
  4. Features must be independent
  5. All values must be non-negative
#58

What is the issue in this code?

    model = NMF(n_components=5)
    X_new = model.fit_transform(X_scaled)

(X_scaled contains negative values after scaling)

  1. Scaling improves NMF
  2. fit_transform invalid
  3. Too many features
  4. Components missing
  5. Scaling introduces negatives
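
A hedged sketch of the failure and one possible workaround. The random data, `n_components=3` and the choice of `MinMaxScaler` as a replacement are assumptions, not the only fix.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Strictly positive example data (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(1, 10, size=(50, 6))

# Standardizing centres each column on 0, so about half the values turn negative.
X_std = StandardScaler().fit_transform(X)
try:
    NMF(n_components=3, init="random", random_state=0).fit_transform(X_std)
    failed = False
except ValueError as err:
    failed = True
    print("NMF rejected the input:", err)

# Min-max scaling keeps every value in [0, 1], which NMF accepts.
X_mm = MinMaxScaler().fit_transform(X)
W = NMF(n_components=3, init="random", random_state=0, max_iter=500).fit_transform(X_mm)
print(W.shape)
```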
#59

You increase n_components and reconstruction error decreases. What does this mean?

  1. Model overfits immediately
  2. Data is invalid
  3. No learning
  4. Components useless
  5. Better approximation

— L14 —

#60

What is ensemble learning?

  1. Using one strong model
  2. Removing weak models
  3. Scaling data
  4. Reducing features
  5. Combining multiple models
#61

What is the main benefit of Random Forest?

  1. Removes features
  2. Reduces dataset
  3. Speeds up scaling
  4. Encodes data
  5. Reduces overfitting
#62

What is the key difference between bagging and boosting?

  1. Bagging is sequential
  2. Boosting is parallel
  3. Both are identical
  4. Both remove data
  5. Boosting is sequential
#63

What is bagging?

  1. Sequential training
  2. Feature scaling
  3. Removing samples
  4. Encoding data
  5. Parallel training on subsets