In today’s era of highly efficient data collection methods, it is increasingly possible to come across blank entries within datasets comprising millions of rows.
However, it is still possible for blanks to appear from sources other than the point of data collection.
These empty values can significantly impact the accuracy, calculations, and insights derived from a dataset. Fortunately, various techniques exist for addressing blank entries, ranging from simple to more complex solutions.
In this post, you will discover how to effectively replace blank entries with zeros in your dataset using the powerful capabilities of Power BI.
Replace Blank with Zero in Power Query
Power Query is Power BI’s native data transformation and preparation tool and it’s also available in Microsoft Excel.
With Power Query, you can easily connect to various data sources, transform, and shape data according to your needs.
When loading data into Power Query, blank entries are displayed as null values. While there can be various reasons for this occurrence, Power Query provides a range of tools that can effectively address these blank rows, regardless of the underlying cause.
Replace Blank with 0 using Replace Values
The Replace Values feature in Power Query is very similar in functionality to the find and replace feature you find on most software programs. With Replace Values, you can effortlessly identify blank cells and swiftly substitute them with zeros.
Here’s how to use it.
- Select the columns containing the blank entries. To select multiple columns, hold down the Shift key and use the right-click mouse button to select the columns.
- Go to the Home ribbon and select Replace Values. You will find it under the Transform section.
- By clicking on Replace Values, the Replace Values dialogue window will open. In this window, you can specify the value you want to find, which, in this case, is null, in the Value to Find box. Then, you can determine the value that will replace the null entries by entering 0 in the Replace With box.
- Click OK when you’re done.
After completing the process, the selected columns will have the blank cells replaced with 0.
Replace Blank Cells with Add Column
Within Power Query, the Add Column ribbon offers a user-friendly and efficient method for creating new columns by applying calculations, transformations, or merging operations to existing ones.
This functionality provides a wide range of functions and options that is powered by the Mashup programming language (m query), which is a data transformation programming language available in Power Query.
By leveraging the m query, you can utilize conditional logic, manipulate text, and conduct mathematical operations to customize the new columns according to your precise analytical needs.
To replace blank cells with 0 using the Add Column feature, follow these steps.
- Go to the Add Column ribbon and select Custom Column.
if [Price] = null
then 0
else
[Price]
- When you click on Custom Column, it opens the Custom Column dialogue box. In this dialogue window, you need to copy and paste the provided syntax into the Custom column formula box.
The syntax utilizes an if statement to check each row in the [Price] column. If a row contains a null value, the formula will return 0 as a replacement. On the other hand, any non-null value in the [Price] column will be returned as it is.
After pasting the syntax and clicking on the OK button in the previous window, Power Query will carry out the calculations using the syntax and return the results in a new column.
While this is a simple approach, the downside is that you will have to repeat the same steps if you want to remove blanks for multiple columns. Fortunately, the next method removes this redundancy.
Replace Blank Cells with Advanced Editor
The Advanced Editor is a valuable tool that helps you to accomplish intricate data transformations. By providing direct access to the underlying M code behind each data transformation operation, the Advanced Editor allows for more sophisticated manipulations.
When it comes to replacing blank values, the Advanced Editor offers a distinct advantage. Unlike the standard interface, it enables you to replace null values directly within the original column, eliminating the need to create a new column.
Since the transformation will take place within the parent column, it keeps the original data type.
Using the Advanced Editor is perhaps the most efficient and streamlined approach for carrying out most data transformation tasks effectively in Power Query, including replacing blank cells.
To use the Advanced Editor to replace blank cells, go to the Query section in the Home tab and click on Advanced Editor.
When you access the Advanced Editor dialogue box in Power Query, you will unveil the underlying source code that drives the data processing.
The code in the editor is a result of every action carried out in Power Query, right from the moment of importing the data up to the last data manipulation action done.
#"Replace Null" = Table.ReplaceValue(#"Changed Type", null, 0, Replacer.ReplaceValue, {"Price", "Quantity"})
To replace blank cells in the [Price] and [Quantity] columns with 0, you can utilize the Table.ReplaceValue() function. This syntax provides a convenient way to achieve the same result as the Custom Column feature in a more efficient manner.
By using the Table.ReplaceValue() function, you can effectively identify blank cells in multiple columns and replace them with 0.
The function has the following arguments:
table
– the table or query with the column containing the null values.oldValue
– the value you want to replace.newValue
– the value you want to replace theoldValue
with.replacer
– the function that will carry out the replacement operation.columnsToSearch
– a list of column(s) where the replacement will take place.
In the syntax, the table argument is set as #”Changed Type” since it is a variable that represents a table. To specify the old and new values for replacement, null and 0 are used respectively.
The Replacer.ReplaceValue function is employed in the replace argument to substitute values in the [Price] and [Quantity] columns.
This function is assigned to the variable Replace Null, which is then evaluated within the In statement. By utilizing this syntax, you can effectively replace null values with 0 in the [Price] and [Quantity] columns.
To add a new line of code to the Let statement and assign it to a variable name in M language, copy and paste the syntax into the In statement of the query. But ensure to add a comma to the final line of code.
Click on the Done button after pasting the code to the editor.
Once all null values have been replaced with 0 in the column, the formula bar will show the updated syntax used to create the column, and you can easily see that there are no cells with nulls.
Replace Null Values with DAX
To address null values in Power BI, an alternative approach is to utilize DAX, the formula language used in Power BI for custom calculations. By creating a calculated column, you can replace null values as desired.
However, before creating the calculated column, it is essential to load your data into Power BI. This ensures that you have the necessary dataset available for applying the DAX formulas and achieving the desired replacement of null values.
Now that the data is loaded into Power BI, cells that displayed nulls are now empty. To create a calculated column, navigate to the Table view by clicking on the grid icon. From there, select the New column option in the Table tools tab.
Price (no blanks) = IF(
ISBLANK(
'In'[Price]
),
0,
'In'[Price]
)
Copy and paste the provided syntax into the formula bar. This syntax utilizes the IF function to create a new column named [Price (nulls replaced)].
The IF function works by checking for blank cells in the [Price] column and replacing them with 0 while retaining non-blank values as they are.
However, it’s important to note that while a calculated column can help address blank or null values in your data, it may not be the most efficient approach. This is because calculated columns can increase memory usage by adding extra columns to your dataset.
Furthermore, unlike Power Query, the original column cannot be deleted since the new column depends on it. Therefore, careful consideration should be given to the pros and cons of using calculated columns in your data modelling process.
Conclusion
Power Query and DAX are instrumental tools within Power BI, offering efficient data transformation and analysis capabilities.
When it comes to replacing blanks with zeros, the most efficient approach is to perform this transformation in Power Query before loading the data into Power BI. This not only eliminates redundancy but also optimizes memory usage.
However, calculated columns can still be valuable in certain scenarios. Understanding the appropriate situations to utilize each method is key to maximizing their effectiveness and achieving optimal results in your data processing and analysis workflows.
0 Comments