The complete AI for Medical Diagnosis course is currently offered by DeepLearning.AI on the Coursera platform and is taught by Pranav Rajpurkar.

AI is transforming the practice of medicine. It’s helping doctors diagnose patients more accurately, make predictions about patients’ future health, and recommend better treatments. As an AI practitioner, you have the opportunity to join in this transformation of modern medicine. If you're already familiar with some of the math and coding behind AI algorithms, and are eager to develop your skills further to tackle challenges in the healthcare industry, then this specialization is for you. No prior medical expertise is required!

AI for Medical Diagnosis Week 1 Quiz Answers

Disease detection with computer vision

Q1. Which of the following is not one of the key challenges for AI diagnostic algorithms discussed in the lecture?
• Dataset size
• Inflexible models
• Class imbalance

Q2. You find that your training set has 70% negative examples and 30% positive. Which of the following techniques will NOT help for training this imbalanced dataset?
• Oversampling negative examples
• Oversampling positive examples
• Undersampling negative examples
• Reweighting examples in training loss
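
To make the reweighting option concrete, here is a minimal sketch of a class-weighted binary cross-entropy loss using the natural logarithm. The function name `weighted_bce`, the toy labels, and the constant prediction of 0.5 are assumptions made for illustration; weighting each class by the frequency of the other class is one common choice, not necessarily the only one.

```python
import math

def weighted_bce(y_true, p_pred, w_pos, w_neg):
    """Sum of class-weighted binary cross-entropy losses (natural log)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        if y == 1:
            total += -w_pos * math.log(p)
        else:
            total += -w_neg * math.log(1.0 - p)
    return total

# Hypothetical toy data with the 70% negative / 30% positive split from Q2.
labels = [0] * 7 + [1] * 3

freq_pos = sum(labels) / len(labels)  # fraction of positive examples
freq_neg = 1.0 - freq_pos             # fraction of negative examples

# Assumption: weight each class by the frequency of the opposite class.
w_pos, w_neg = freq_neg, freq_pos

# Contribution of each class when an uninformative model predicts 0.5 everywhere.
pos_contrib = weighted_bce([1] * 3, [0.5] * 3, w_pos, w_neg)
neg_contrib = weighted_bce([0] * 7, [0.5] * 7, w_pos, w_neg)
print(pos_contrib, neg_contrib)
```

With these weights the two classes contribute equally to the total loss even though negatives outnumber positives, which is the purpose of reweighting.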

Q3. What is the total loss from the normal (non-mass) examples in this example dataset?

Please use the natural logarithm in your calculation. When you use numpy.log, this is using the natural logarithm. Also, to get the total loss, please add up the losses from each ‘normal’ example.

• 1.27
• 279
• 20.00
• -0.4

Q4. What is the typical size of a medical image dataset?
• ~1 to 1 hundred images
• ~1 million or more images
• ~1 hundred to 1 thousand images
• ~10 thousand to 100 thousand images

Q5. Which of the following data augmentations would be best to apply?

• None of the above

Q6. Which of the following are valid methods for determining ground truth? Choose all that apply.
• Biopsy
• Confirmation by CT scan
• Consensus voting from a board of doctors

Q7. In what order should the training, validation, and test sets be sampled?
• Validation, Test, Training
• Test, Validation, Training
• Validation, Training, Test
• Training, Validation, Test

Q8. Why is it bad to have the same patients in both training and test sets?
• Overly optimistic test performance
• Leaves too few images for the test set
• Leaves too few images for the training set
• None of the above
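
Patient overlap can be avoided by splitting on patient identifiers rather than on individual images. The following is a minimal sketch under stated assumptions: the function `split_by_patient`, the `(patient_id, image)` record layout, and the split fractions are all hypothetical and chosen only for illustration.

```python
import random

def split_by_patient(records, test_frac=0.2, val_frac=0.2, seed=0):
    """Split (patient_id, image) records so that no patient spans two sets."""
    patients = sorted({pid for pid, _ in records})
    rng = random.Random(seed)
    rng.shuffle(patients)

    n_test = max(1, int(len(patients) * test_frac))
    n_val = max(1, int(len(patients) * val_frac))

    # Assign whole patients: test first, then validation, the rest to training.
    test_ids = set(patients[:n_test])
    val_ids = set(patients[n_test:n_test + n_val])

    splits = {"train": [], "val": [], "test": []}
    for pid, image in records:
        if pid in test_ids:
            splits["test"].append((pid, image))
        elif pid in val_ids:
            splits["val"].append((pid, image))
        else:
            splits["train"].append((pid, image))
    return splits

# Hypothetical dataset: 10 patients with 3 images each.
records = [(f"patient_{i}", f"img_{i}_{j}.png") for i in range(10) for j in range(3)]
splits = split_by_patient(records)

train_ids = {pid for pid, _ in splits["train"]}
test_ids = {pid for pid, _ in splits["test"]}
print(train_ids & test_ids)  # empty set: no patient appears in both
```

Because every image from a given patient lands in exactly one set, the model cannot score well on the test set merely by recognizing a patient it saw during training.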

Q9. Let's say you have a relatively small training set (~5 thousand images). Which training strategy makes the most sense?
• Retraining the last layer of a pre-trained model
• Retraining all layers of a pre-trained model
• Retraining the first layer of a pre-trained model
• Train a model with randomly initialized weights

Q10. Now let's say you have a very large dataset (~1 million images). Which training strategies will make the most sense?
• Retraining all layers of a pretrained model
• Retraining the last layer of a pretrained model
• Retraining the first layer of a pretrained model
• Training a model with randomly initialized weights.

AI for Medical Diagnosis Week 2 Quiz Answers

Evaluating machine learning models

Q1. What is the sensitivity and specificity of a pneumonia model that always outputs positive?
In other words, the model says that every patient has the disease.
• sensitivity = 1.0, specificity = 1.0
• sensitivity = 1.0, specificity = 0.0
• sensitivity = 0.0, specificity = 1.0
• sensitivity = 0.5, specificity = 0.5
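
The always-positive case can be checked directly from the definitions: sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP). The sketch below uses a small set of hypothetical labels; the helper name `sensitivity_specificity` is an assumption for illustration.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical ground truth: 2 patients with pneumonia, 3 without.
y_true = [1, 0, 0, 1, 0]
always_positive = [1] * len(y_true)

sens, spec = sensitivity_specificity(y_true, always_positive)
print(sens, spec)
```

Every diseased patient is flagged, so no positives are missed, while every healthy patient is also flagged, so no negatives are correctly identified.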

Q2. In some studies, you may have to compute the positive predictive value (PPV) from the sensitivity, specificity, and prevalence. Given a sensitivity = 0.9, specificity = 0.8, and prevalence = 0.2, what is the PPV?

HINT: please check the reading item "Calculating PPV in terms of sensitivity, specificity and prevalence"
• 0.9
• 0.18
• 0.02
• 0.53
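
The PPV follows from Bayes' rule: true positives make up sensitivity × prevalence of the population, and false positives make up (1 − specificity) × (1 − prevalence). A short sketch of the calculation, with the function name `ppv_from_rates` chosen here for illustration:

```python
def ppv_from_rates(sensitivity, specificity, prevalence):
    """PPV = P(disease | positive test), via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

ppv = ppv_from_rates(sensitivity=0.9, specificity=0.8, prevalence=0.2)
print(round(ppv, 2))
```

Here the true-positive mass is 0.9 × 0.2 = 0.18 and the false-positive mass is 0.2 × 0.8 = 0.16, so the PPV is 0.18 / 0.34.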

Q3. If sensitivity = 0.9, specificity = 0.8, and prevalence = 0.2, then what is the accuracy?

Hint: You can watch the video "Sensitivity, Specificity and Prevalence" to find the equation.
• 0.44
• 0.52
• 0.75
• 0.82
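
Accuracy can be written as a prevalence-weighted average of sensitivity and specificity: accuracy = sensitivity × prevalence + specificity × (1 − prevalence). A minimal sketch, with the function name `accuracy_from_rates` chosen here for illustration:

```python
def accuracy_from_rates(sensitivity, specificity, prevalence):
    """Accuracy as a prevalence-weighted mix of sensitivity and specificity."""
    return sensitivity * prevalence + specificity * (1.0 - prevalence)

acc = accuracy_from_rates(sensitivity=0.9, specificity=0.8, prevalence=0.2)
print(round(acc, 2))
```

The two terms are 0.9 × 0.2 = 0.18 (correctly classified positives) and 0.8 × 0.8 = 0.64 (correctly classified negatives).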

Q4. What is the sensitivity and specificity of a model which randomly assigns a score between 0 and 1 to each example (with equal probability) if we use a threshold of 0.7?
• Sensitivity = 0.5, Specificity = 0.5
• Sensitivity = 0.7, Specificity = 0.3
• Sensitivity = 0.3, Specificity = 0.7
• Not enough information to answer the question.
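
A quick simulation can illustrate the random-score model. The sketch below assumes that a score at or above the threshold counts as a positive prediction; the sample size, seed, and 50/50 label mix are arbitrary choices for illustration, and the result does not depend on the label mix because scores are independent of labels.

```python
import random

rng = random.Random(0)
n = 100_000
threshold = 0.7

# Scores are uniform on [0, 1) and independent of the (hypothetical) labels.
labels = [rng.randint(0, 1) for _ in range(n)]
scores = [rng.random() for _ in range(n)]
preds = [1 if s >= threshold else 0 for s in scores]

tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)

sens = tp / (tp + fn)  # fraction of positives scored at or above 0.7
spec = tn / (tn + fp)  # fraction of negatives scored below 0.7
print(round(sens, 2), round(spec, 2))
```

A uniform score exceeds 0.7 about 30% of the time regardless of the true label, so roughly 30% of positives are caught and roughly 70% of negatives are correctly rejected.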

Q5. What is the PPV and sensitivity associated with the following confusion matrix?

Recall that

Sensitivity = How many actual positives are predicted positive?

• Not enough information is given
• PPV = 0.3, Sensitivity = 0.6
• PPV = 0.4, Sensitivity = 0.2
• PPV = 0.6, Sensitivity = 0.33

Q6. You have a model such that the lowest score for a positive example is higher than the maximum score for a negative example. What is its ROC AUC?

HINT 1: watch the video "Varying the threshold".

HINT 2: draw a number line and choose a value for the lowest prediction score of any positive example, then choose another value for the highest prediction score of any negative example. Draw a few circles for the positive examples and a few "x" marks for the negative examples. What do you notice about the model's ability to identify positive and negative examples?
• 1.0
• 0.82
• 0.52
• Not enough information is given
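
ROC AUC equals the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half. The sketch below computes it by comparing every positive/negative pair; the function name `roc_auc` and the scores are hypothetical values chosen so that the lowest positive score exceeds the highest negative score.

```python
def roc_auc(y_true, scores):
    """ROC AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))

# Hypothetical scores: every positive scores above every negative.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.6, 0.8, 0.9, 0.1, 0.3, 0.5]

auc = roc_auc(y_true, scores)
print(auc)
```

When the two groups are perfectly separated, every pair is ranked correctly, so some threshold achieves sensitivity and specificity of 1 at the same time.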

Q7. For every specificity, as we vary the threshold, the sensitivity of model 1 is at least as high as model 2. Which of the following must be true?
• The ROC of model 2 is higher than model 1
• The accuracy of model 2 is higher than model 1
• The ROC of model 1 is at least as high as model 2
• None of the above

Q8. You want to measure the proportion of people with high blood pressure in a population. You sample 1000 people and find that 55% have high blood pressure with a 90% confidence interval of (50%, 60%). What is the correct interpretation of this result?

• If we repeated this sampling, the middle of the confidence interval would be 55%, 90% of the time
• If you repeated this sampling, the true proportion would be in the confidence interval about 90% of the time
• There is a 5% chance that the true mean is less than 50%
• With 90% probability, the proportion of people with high blood pressure is between 50% and 60%

Q9. One experiment calculates a confidence interval using 1,000 samples, and another computes it using 10,000 samples. Which interval do you expect to be tighter (assume both use the normal approximation)?
• 10,000 samples
• Cannot say with confidence
• 1,000 samples
• Not enough information
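
Under the normal approximation, the half-width of a confidence interval for a proportion is z × sqrt(p(1 − p) / n), so it shrinks in proportion to 1 / sqrt(n). The sketch below compares the two sample sizes; the observed proportion of 0.5 is a hypothetical value used only for illustration, and z = 1.645 is the standard two-sided 90% critical value.

```python
import math

def normal_ci(p_hat, n, z=1.645):
    """Normal-approximation confidence interval for a proportion."""
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

p_hat = 0.5  # hypothetical observed proportion

lo_small, hi_small = normal_ci(p_hat, 1_000)
lo_large, hi_large = normal_ci(p_hat, 10_000)

width_small = hi_small - lo_small
width_large = hi_large - lo_large
print(width_small / width_large)  # close to sqrt(10)
```

Multiplying the sample size by 10 narrows the interval by a factor of about sqrt(10), roughly 3.16.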

AI for Medical Diagnosis Week 3 Quiz Answers

Segmentation on medical images

Q1. Which of the following is a segmentation task?
• Determining whether there is a mass in a chest X-ray
• Determining whether a brain tumor is present in an MRI
• Determining which areas of the brain have tumor from an MRI
• None of the above

Q2. What is the MAIN disadvantage of processing each MRI slice independently using a 2D segmentation model (as mentioned in the lecture)?

• You lose some context between slices
• It is difficult to register slices of MRI models
• 3D models are always better than 2D models
• None of the above

Q3. The U-net consists of...
• Just a contracting path
• Just an expanding path
• An expanding path followed by a contracting path
• A contracting path followed by an expanding path

Q4. Which of the following data augmentations is most effective for MRI sequences?
• Rotation
• Shuffling the slices
• Randomly shuffle the pixels in each slice
• Shifting each pixel to the right by a constant amount with wrap around

Q5. What is the soft dice loss for the example below?

• 0.089
• 0.544
• -0.089
• 0.910
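
For reference, a minimal sketch of a soft dice loss on flattened pixel lists. It assumes the form 1 − 2·Σ(p·g) / (Σp² + Σg²), where p are predicted probabilities and g are ground-truth labels; the function name, the small `eps` guard against division by zero, and the example values are assumptions for illustration and are not the example from the quiz.

```python
def soft_dice_loss(pred, truth, eps=1e-7):
    """Soft dice loss: 1 - 2*sum(p*g) / (sum(p^2) + sum(g^2))."""
    numerator = 2.0 * sum(p * g for p, g in zip(pred, truth))
    denominator = sum(p * p for p in pred) + sum(g * g for g in truth)
    return 1.0 - numerator / (denominator + eps)

# Hypothetical 4-pixel ground truth and three candidate predictions.
truth = [1, 0, 1, 0]
perfect = [1.0, 0.0, 1.0, 0.0]
uncertain = [0.5, 0.5, 0.5, 0.5]
inverted = [0.0, 1.0, 0.0, 1.0]

loss_perfect = soft_dice_loss(perfect, truth)
loss_uncertain = soft_dice_loss(uncertain, truth)
loss_inverted = soft_dice_loss(inverted, truth)
print(loss_perfect, loss_uncertain, loss_inverted)
```

With probabilities in [0, 1], this form of the loss stays between 0 (prediction matches the ground truth exactly) and 1 (no overlap with the ground truth).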

Q6. Look at the output of model 1 and model 2:

Which one will have a lower soft dice loss?

Hint: Notice the prediction scores of P1 and P2 on the pixels where the ground truth is 1. This may help you focus on certain parts of the soft dice loss formula:

• They will be the same
• Model 1 has a lower loss
• Model 2 has a lower loss
• None of the above

Q7. What is the minimum value of the soft dice loss?

• - infinity
• 0
• 1
• 4

Q8. An X-ray classification model is developed on data from US hospitals and is later tested on an external dataset from Latin America. Which of the following do you expect?
• Performance remains unchanged
• Performance improves on the new dataset
• Performance drops on the new dataset
• None of the above

Q9. Which of the following is an example of a prospective study?
• A model is deployed for 1 year in an emergency room and its performance over that time is evaluated
• A model is trained on data collected between 2001 and 2010 and then validated on data collected between 2011 and 2013
• A model is trained and tested on a dataset of X-rays collected between 2001 and 2010
• None of the above