w3resource

Compare Two Data Frames in R to Find Unique Rows


Write an R program to compare two data frames and find the row(s) in the first data frame that are not present in the second data frame.

Sample Solution:

R Programming Code:

# Create the first data frame with sales data for three items
df_90 = data.frame(
  "item" = c("item1", "item2", "item3"),        # Column for item names
  "Jan_sale" = c(12, 14, 12),                   # Sales in January
  "Feb_sale" = c(11, 12, 15),                   # Sales in February
  "Mar_sale" = c(12, 14, 15)                    # Sales in March
)

# Create the second data frame with sales data for three items
df_91 = data.frame(
  "item" = c("item1", "item2", "item3"),        # Column for item names
  "Jan_sale" = c(12, 14, 12),                   # Sales in January
  "Feb_sale" = c(11, 12, 15),                   # Sales in February
  "Mar_sale" = c(12, 15, 18)                    # Sales in March
)

# Print a message indicating the following output is the original dataframes
print("Original Dataframes:")

# Print the first data frame
print(df_90)

# Print the second data frame
print(df_91)

# Print a message indicating the following output is the rows in the first data frame not present in the second data frame
print("Row(s) in first data frame that are not present in second data frame:")

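# Build one key string per row so that whole rows can be compared
# (base R's setdiff() would compare the data frames column by column)
key_90 = do.call(paste, c(df_90, sep = "\t"))
key_91 = do.call(paste, c(df_91, sep = "\t"))

# Print the rows of the first data frame whose key does not occur in the second data frame
print(df_90[!(key_90 %in% key_91), ])

Output:

[1] "Original Dataframes:"
   item Jan_sale Feb_sale Mar_sale
1 item1       12       11       12
2 item2       14       12       14
3 item3       12       15       15
   item Jan_sale Feb_sale Mar_sale
1 item1       12       11       12
2 item2       14       12       15
3 item3       12       15       18
[1] "Row(s) in first data frame that are not present in second data frame:"
   item Jan_sale Feb_sale Mar_sale
2 item2       14       12       14
3 item3       12       15       15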
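
Base R's setdiff() is written for vectors and lists, so two data frames passed to it are compared column by column rather than row by row. The short sketch below illustrates this with two small made-up data frames, a and b (names and values chosen for this example only). It calls base::setdiff() explicitly because packages such as dplyr supply their own setdiff() for data frames.

```r
# Two made-up data frames (hypothetical example data): only the second
# entry of `value` differs between them
a <- data.frame(id = c(1, 2, 3), value = c(10, 20, 30))
b <- data.frame(id = c(1, 2, 3), value = c(10, 25, 30))

# base::setdiff() sees each data frame as a list of columns, so it keeps
# every column of `a` that has no identical column in `b`
col_diff <- base::setdiff(a, b)

# Show which columns were returned: the differing column, not the differing row
print(names(col_diff))
```

The result names the column that differs, which is why a row-level comparison needs a different approach.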

Explanation:

  • Create Data Frames:
    • df_90 and df_91 are two data frames created with sales data for three items.
    • Both data frames have columns: item, Jan_sale, Feb_sale, and Mar_sale.
  • Define df_90:
    • item: Contains item names ("item1", "item2", "item3").
    • Jan_sale: Sales figures for January (12, 14, 12).
    • Feb_sale: Sales figures for February (11, 12, 15).
    • Mar_sale: Sales figures for March (12, 14, 15).
  • Define df_91:
    • item: Contains item names ("item1", "item2", "item3").
    • Jan_sale: Sales figures for January (12, 14, 12).
    • Feb_sale: Sales figures for February (11, 12, 15).
    • Mar_sale: Sales figures for March (12, 15, 18).
  • Print Original Data Frames:
    • Outputs a message indicating that the following data are the original data frames.
    • Prints the contents of df_90.
    • Prints the contents of df_91.
  • Find and Print Differences:
    • Outputs a message indicating that the following data are the rows in df_90 that are not present in df_91.
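    • Base R's setdiff(df_90, df_91) does not compare rows: it treats each data frame as a list of columns and returns the columns of df_90 that have no identical column in df_91 (here Mar_sale).
    • A row-wise comparison is needed instead, for example by building one key per row from its values; for this data it yields rows 2 and 3 of df_90 (item2 and item3), which differ from df_91 in Mar_sale.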
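
The row-wise comparison from the last step can also be wrapped in a small reusable function. The sketch below is one possible base R approach, not the original solution: rows_not_in() is a hypothetical helper name, and it assumes both data frames have the same columns in the same order. It stacks the two data frames and uses duplicated() to spot rows of the first that already occur in the second.

```r
# Same data as in the exercise
df_90 <- data.frame(
  item = c("item1", "item2", "item3"),
  Jan_sale = c(12, 14, 12),
  Feb_sale = c(11, 12, 15),
  Mar_sale = c(12, 14, 15)
)
df_91 <- data.frame(
  item = c("item1", "item2", "item3"),
  Jan_sale = c(12, 14, 12),
  Feb_sale = c(11, 12, 15),
  Mar_sale = c(12, 15, 18)
)

# Hypothetical helper (not part of the original solution):
# returns the distinct rows of x that do not occur in y
rows_not_in <- function(x, y) {
  # Assumption: both data frames share the same column names and order
  stopifnot(identical(names(x), names(y)))
  combined <- rbind(y, x)
  # duplicated() marks a row when an identical row appears earlier in `combined`;
  # y comes first, so a marked row of x already exists in y (or earlier in x)
  seen_before <- duplicated(combined)[nrow(y) + seq_len(nrow(x))]
  x[!seen_before, , drop = FALSE]
}

only_in_90 <- rows_not_in(df_90, df_91)
print(only_in_90)
```

For this data the helper keeps the item2 and item3 rows of df_90. If dplyr is available, dplyr::setdiff(df_90, df_91) is another option for a row-wise difference.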
