w3resource

R Programming: Count the Number of NA values in a Data Frame column


Write an R program to count the number of NA values in a data frame column.

Sample Solution:

R Programming Code:

# Create a data frame named 'exam_data' with columns 'name', 'score', 'attempts', and 'qualify'
exam_data = data.frame(
  name = c('Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'),  # Column with names
  score = c(12.5, 9, 16.5, 12, 9, 20, 14.5, 13.5, 8, 19),  # Column with scores
  attempts = c(1, NA, 2, NA, 2, NA, 1, NA, 2, 1),  # Column with attempts, including some NA values
  qualify = c('yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes')  # Column with qualification status
)

# Print the message indicating that the following output is the original dataframe
print("Original dataframe:")

# Print the content of the 'exam_data' data frame
print(exam_data)

# Print the message indicating that the following output shows the number of NA values in the 'attempts' column
print("The number of NA values in attempts column:")

# Calculate and print the number of NA values in the 'attempts' column of the 'exam_data' data frame
print(sum(is.na(exam_data$attempts)))

Output:

[1] "Original dataframe:"
        name score attempts qualify
1  Anastasia  12.5        1     yes
2       Dima   9.0       NA      no
3  Katherine  16.5        2     yes
4      James  12.0       NA      no
5      Emily   9.0        2      no
6    Michael  20.0       NA     yes
7    Matthew  14.5        1     yes
8      Laura  13.5       NA      no
9      Kevin   8.0        2      no
10     Jonas  19.0        1     yes
[1] "The number of NA values in attempts column:"
[1] 4

Explanation:

  • Create a Data Frame:
    • exam_data = data.frame(...)
      • Creates a data frame named exam_data with four columns: name, score, attempts, and qualify.
      • The name column contains names of individuals.
      • The score column contains numerical scores.
      • The attempts column contains numeric values, some of which are NA (missing).
      • The qualify column contains qualification status as 'yes' or 'no'.
  • Print Original Data Frame:
    • print("Original dataframe:")
      • Prints the message "Original dataframe:" to indicate that the following output is the original data frame.
    • print(exam_data)
      • Prints the exam_data data frame to the console.
  • Print Number of NA Values:
    • print("The number of NA values in attempts column:")
      • Prints the message "The number of NA values in attempts column:" to indicate that the following output is the count of NA values in the attempts column.
    • print(sum(is.na(exam_data$attempts)))
      • is.na(exam_data$attempts) creates a logical vector identifying NA values in the attempts column.
      • sum(...) adds up that logical vector; each TRUE counts as 1 and each FALSE as 0, so the result is the number of NA values.
      • print(...) prints the count of NA values to the console.
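
The counting step above can be broken into its intermediate pieces. The following is a minimal sketch, not part of the original solution: it rebuilds only the score and attempts columns so that it runs on its own, and the variable names na_flags, na_count, na_rows, and na_per_column are introduced here purely for illustration.

```r
# Rebuild two columns of the exercise's data frame so the sketch is self-contained
exam_data <- data.frame(
  score = c(12.5, 9, 16.5, 12, 9, 20, 14.5, 13.5, 8, 19),
  attempts = c(1, NA, 2, NA, 2, NA, 1, NA, 2, 1)
)

# Step 1: is.na() returns TRUE where a value is missing and FALSE otherwise
na_flags <- is.na(exam_data$attempts)
print(na_flags)

# Step 2: sum() treats TRUE as 1 and FALSE as 0, so the total is the NA count
na_count <- sum(na_flags)
print(na_count)

# which() gives the row positions of the missing values
na_rows <- which(na_flags)
print(na_rows)

# colSums() over is.na() of the whole data frame counts NAs in every column at once
na_per_column <- colSums(is.na(exam_data))
print(na_per_column)
```

Here na_count matches the value 4 shown in the output above, and na_rows points at rows 2, 4, 6, and 8. The colSums() variant is one possible way to check every column in a single call when more than one column may contain missing values.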
