Script 367: SCRIPT Assign Portfolio to 'Amazon Portfolio' Dimension

Purpose

This Python script overwrites the “Amazon Portfolio” column with the value of the “Portfolio” column in every row where the two differ or where “Amazon Portfolio” is empty, and returns only the rows it changed.

To Elaborate

The script keeps two columns of a DataFrame consistent: “Portfolio” and “Amazon Portfolio.” For each row, it checks whether the “Amazon Portfolio” value differs from the “Portfolio” value or is null or blank. If so, it overwrites “Amazon Portfolio” with the “Portfolio” value. Because any differing value is overwritten, including one that was set deliberately, the script suits cases where “Amazon Portfolio” should always mirror “Portfolio.” Finally, it returns only the rows that were changed, so the output lists exactly the campaigns that were updated.
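
As a minimal illustration of the rule described above, the rows that would be overwritten can be flagged as follows. The sample rows and the names `is_blank` and `needs_update` are made up for this sketch; only the two column names come from the script.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the script.
df = pd.DataFrame(
    {
        "Campaign": ["A", "B", "C", "D"],
        "Portfolio": ["Brand", "Brand", "Generic", "Generic"],
        "Amazon Portfolio": ["Brand", "Old", np.nan, "  "],
    },
    dtype=object,
)

amazon = df["Amazon Portfolio"]

# Null and whitespace-only values both count as "empty".
is_blank = amazon.fillna("").astype(str).str.strip() == ""

# A row needs updating when it is empty or differs from "Portfolio".
needs_update = is_blank | (amazon != df["Portfolio"])

print(df.loc[needs_update, "Campaign"].tolist())
```

Campaign A already matches and is left alone; B differs, C is null, and D is blank, so those three are flagged.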

Walking Through the Code

  1. Initialization
    • The script begins by defining a function named process that takes a DataFrame inputDf as its parameter.
    • It creates a copy of inputDf named outputDf to work with, ensuring the original data remains unchanged, and prints the input table (via tableize) for debugging.
  2. Iterating and Updating
    • The script iterates over each row in outputDf.
    • For each row, it checks if the “Amazon Portfolio” is different from “Portfolio” or if it is null/blank.
    • If any of these conditions are met, it updates the “Amazon Portfolio” to match the “Portfolio.”
  3. Filtering Changes
    • After updating, the script filters the DataFrame to include only rows where the “Amazon Portfolio” was changed.
    • This is done by comparing the updated “Amazon Portfolio” with the original values in inputDf.
  4. Output
    • The function returns the filtered DataFrame containing only the rows with changes, allowing users to see which entries were updated.
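
The four steps above can also be sketched without a row-by-row loop. This is a hypothetical vectorized variant, not the original script: the name `process_vectorized` is invented here, and it assumes that rows where both columns are null should count as unchanged.

```python
import numpy as np
import pandas as pd


def process_vectorized(input_df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical vectorized take on the four steps (not the original script)."""
    # 1. Initialization: work on a copy so the caller's DataFrame is untouched.
    output_df = input_df.copy()

    # 2. Updating: flag rows whose "Amazon Portfolio" differs from "Portfolio".
    #    A null never equals anything, so null rows are flagged automatically;
    #    rows that are null in BOTH columns are excluded (assumption of this sketch).
    current = output_df["Amazon Portfolio"]
    target = output_df["Portfolio"]
    changed = (current != target) & ~(current.isna() & target.isna())
    output_df.loc[changed, "Amazon Portfolio"] = target[changed]

    # 3 and 4. Filtering and output: return only the rows that were overwritten.
    return output_df[changed]


# Made-up sample data for illustration.
sample = pd.DataFrame(
    {
        "Campaign": ["A", "B", "C", "D"],
        "Portfolio": ["Brand", "Brand", "Generic", np.nan],
        "Amazon Portfolio": ["Brand", "Old", np.nan, np.nan],
    },
    dtype=object,
)
result = process_vectorized(sample)
print(result[["Campaign", "Amazon Portfolio"]])
```

A separate blank check is not needed in this form: a blank “Amazon Portfolio” either differs from “Portfolio” and is flagged, or equals an equally blank “Portfolio,” in which case overwriting it would change nothing.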

Vitals

  • Script ID: 367
  • Client ID / Customer ID: 1306926773 / 50395
  • Action Type: Bulk Upload
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, Amazon Portfolio
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: Jeremy Brown (jbrown@marinsoftware.com)
  • Created by Jeremy Brown on 2023-10-17 11:10
  • Last Updated by Jeremy Brown on 2023-12-06 04:01

Python Code

import pandas as pd

# Note: inputDf and tableize are not defined in this script;
# presumably the script environment supplies them.

# Define the process function
def process(inputDf):
    # Create a copy of the input DataFrame so the original stays unchanged
    outputDf = inputDf.copy()
    # Print the input data for debugging
    print(tableize(inputDf))

    # Iterate through each row in the DataFrame
    for index, row in outputDf.iterrows():
        portfolio = row["Portfolio"]
        amazon_portfolio = row["Amazon Portfolio"]

        # Update Amazon Portfolio when it is null, blank, or different from Portfolio.
        # The null check runs first, and str() guards .strip() against non-string values.
        if pd.isna(amazon_portfolio) or str(amazon_portfolio).strip() == "" or amazon_portfolio != portfolio:
            outputDf.at[index, "Amazon Portfolio"] = portfolio

    # Keep only the rows whose Amazon Portfolio differs from the original input
    changed_rows = outputDf[(outputDf["Amazon Portfolio"] != inputDf["Amazon Portfolio"]) | (inputDf["Amazon Portfolio"].isnull() & outputDf["Amazon Portfolio"].notnull())]

    return changed_rows

# Trigger the main process
outputDf = process(inputDf)
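
One subtlety in the filter step: in pandas, a missing value never compares equal to anything, including another missing value. A plain `!=` therefore also flags rows where “Amazon Portfolio” is null both before and after the update, which can happen when “Portfolio” is null too. The sketch below uses made-up before/after values (the variable names are illustrative, not part of the script) to show the effect and one possible null-aware alternative.

```python
import numpy as np
import pandas as pd

# Hypothetical before/after values of "Amazon Portfolio" for three rows.
before = pd.Series(["Brand", np.nan, np.nan], dtype=object)
after = pd.Series(["Brand", "Generic", np.nan], dtype=object)

# Plain inequality: the third row is flagged even though nothing changed,
# because NaN != NaN evaluates to True.
naive_changed = before != after

# Null-aware variant: rows that are missing on both sides count as unchanged.
null_safe_changed = naive_changed & ~(before.isna() & after.isna())

print(naive_changed.tolist())
print(null_safe_changed.tolist())
```

Whether such rows should be reported is a judgment call. Excluding them avoids returning rows in which nothing actually changed.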

Post generated on 2024-11-27 06:58:46 GMT
