Script 1301: Extract Studio and Feed Category and Language Target

Purpose:

The Python script processes a DataFrame to extract and categorize information about campaigns, including studio names, feed categories, and language targets.

To Elaborate

The script reads a DataFrame of campaign data and builds a new DataFrame of structured details derived from each campaign name. The studio name is the text before the first hyphen. The feed category is set by substring matching: “- Movie -” maps to Movie, “- TV Show - Seasons -” maps to Seasons, and “- TV Show -” maps to TV Show, while a name matching none of these gets an empty category. The language target is the first two characters after the last hyphen, and the same value is also written to a ‘Campaign Language’ column. Every row receives the constant “YES” in the ‘c-check’ column. The result is campaign data organized for further analysis or reporting.
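To make the rules concrete, here is a minimal sketch of the parsing for a single campaign name. The helper name `parse_campaign` and the sample campaign name are hypothetical, chosen only for illustration; the rules mirror the ones the script applies to each row.

```python
def parse_campaign(campaign: str) -> dict:
    """Hypothetical helper applying the script's rules to one campaign name."""
    # Studio name: the text before the first hyphen
    studio_name = campaign.split("-")[0].strip()

    # Feed category: the more specific "Seasons" pattern is tested before "TV Show"
    if "- Movie -" in campaign:
        feed_category = "Movie"
    elif "- TV Show - Seasons -" in campaign:
        feed_category = "Seasons"
    elif "- TV Show -" in campaign:
        feed_category = "TV Show"
    else:
        feed_category = ""

    # Language target: the first two characters after the last hyphen
    language_target = campaign.split("-")[-1].strip()[:2]

    return {
        "studio_name": studio_name,
        "Feed Category": feed_category,
        "Language Target": language_target,
        "c-check": "YES",
    }


# Made-up campaign name, for illustration only
example = parse_campaign("Acme Studios - TV Show - Seasons - EN")
print(example)
```

For this sample name the helper yields the studio "Acme Studios", the category "Seasons", and the language target "EN".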

Walking Through the Code

Initialization:
- The script begins by defining the primary data source, inputDf, which is a DataFrame containing campaign data.
- It initializes an empty DataFrame, df_out, with specific columns to store the processed data.
Processing Each Row:
- The script iterates over each row in the inputDf.
- For each campaign, it extracts the studio name by splitting the campaign string at hyphens and taking the first part.
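- It determines the feed category by checking for specific substrings in the campaign name: “- Movie -” yields Movie, “- TV Show - Seasons -” yields Seasons, and “- TV Show -” yields TV Show. The more specific Seasons pattern is checked before the general TV Show pattern, and a name matching none of them gets an empty category.
- It extracts the language target by taking the first two characters after the last hyphen in the campaign name, and writes the same value to the ‘Campaign Language’ column.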
- A constant value “YES” is assigned to the ‘c-check’ column for each row.
Constructing and Appending Rows:
- A new row is constructed as a dictionary with the extracted and processed information.
- This new row is appended to the df_out DataFrame.
Output:
- The processed DataFrame, df_out, is printed for debugging, returned by the process function, and assigned to outputDf.
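The same steps can also be expressed with column-wise pandas string operations instead of a row-by-row loop. The sketch below is an alternative formulation, not the script's own method; the function name `extract_columns` and the sample rows are made up for illustration, and `numpy.select` stands in for the if/elif chain.

```python
import numpy as np
import pandas as pd

# Made-up sample data standing in for the report; names are illustrative only
sample = pd.DataFrame({
    "Account": ["Acct A", "Acct A", "Acct B"],
    "Campaign": [
        "Acme Studios - Movie - EN",
        "Acme Studios - TV Show - Seasons - ES",
        "Globex - TV Show - FR",
    ],
})


def extract_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical vectorized equivalent of the per-row processing."""
    campaign = df["Campaign"]
    parts = campaign.str.split("-")  # list of hyphen-separated pieces per row

    out = df[["Campaign", "Account"]].copy()

    # np.select picks the first matching condition, so "Seasons" precedes "TV Show"
    conditions = [
        campaign.str.contains("- Movie -", regex=False),
        campaign.str.contains("- TV Show - Seasons -", regex=False),
        campaign.str.contains("- TV Show -", regex=False),
    ]
    out["Feed Category"] = np.select(
        conditions, ["Movie", "Seasons", "TV Show"], default=""
    )

    out["studio_name"] = parts.str[0].str.strip()
    out["Language Target"] = parts.str[-1].str.strip().str[:2]
    out["Campaign Language"] = out["Language Target"]
    out["c-check"] = "YES"
    return out


result = extract_columns(sample)
print(result)
```

For large reports this form avoids Python-level iteration over rows, at the cost of being a little less explicit than the loop.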

Vitals

Script ID: 1301
Client ID / Customer ID: 1306912147 / 69058
Action Type: Bulk Upload
Item Changed: Campaign
Output Columns: Account, Campaign, studio_name, Feed Category, Language Target, c-check
Linked Datasource: M1 Report
Reference Datasource: None
Owner: Jeremy Brown (jbrown@marinsoftware.com)
Created by Jeremy Brown on 2024-07-29 16:51
Last Updated by Jeremy Brown on 2024-07-30 12:48


Python Code

##
## name: SCRIPT REPORT: Extract Studio and Feed Category and Language Target
## description:
## 
## author: Jeremy Brown
## created: 2024-07-29
## 

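# Imports made explicit here; the hosting environment may already provide them
import datetime
import pandas as pd

# CLIENT_TIMEZONE and dataSourceDict are not defined in this script and are
# assumed to be supplied by the runtime
today = datetime.datetime.now(CLIENT_TIMEZONE).date()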

# primary data source and columns
inputDf = dataSourceDict["1"]
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_STUDIO_NAME = 'studio_name'
RPT_COL_FEED_CATEGORY = 'Feed Category'
RPT_COL_LANGUAGE_TARGET = 'Language Target'
RPT_COL_CAMPAIGN_LANGUAGE = 'Campaign Language'
RPT_COL_CCHECK = 'c-check'
RPT_COL_CAMPAIGN_STATUS = 'Campaign Status'
RPT_COL_IMPR = 'Impr.'

def process(inputDf):
    """
    Process the input DataFrame to populate output DataFrame with specific columns
    and create new 'studio_name', 'Feed Category', 'Language Target', 'Campaign Language' and 'c-check'
    based on the 'Campaign'.
    """
    # Output columns, in the order they should appear
    out_columns = [
        RPT_COL_CAMPAIGN, RPT_COL_ACCOUNT, RPT_COL_FEED_CATEGORY, RPT_COL_STUDIO_NAME,
        RPT_COL_LANGUAGE_TARGET, RPT_COL_CAMPAIGN_LANGUAGE, RPT_COL_CCHECK,
    ]

    # Collect one dictionary per input row; the DataFrame is built once at the end
    # instead of being re-copied with pd.concat on every iteration
    rows = []

    # Iterate through each row in the input DataFrame
    for idx, row in inputDf.iterrows():
        campaign = row[RPT_COL_CAMPAIGN]

        # Task 1: Extract the studio name (text before the first hyphen)
        studio_name = campaign.split('-')[0].strip()

        # Task 2: Determine the Feed Category
        # "- TV Show - Seasons -" must be checked before the more general "- TV Show -"
        if "- Movie -" in campaign:
            feed_category = "Movie"
        elif "- TV Show - Seasons -" in campaign:
            feed_category = "Seasons"
        elif "- TV Show -" in campaign:
            feed_category = "TV Show"
        else:
            feed_category = ""  # Default value if none of the conditions match

        # Task 3: Extract the first 2 characters after the last hyphen
        language_target = campaign.split('-')[-1].strip()[:2]

        # Task 4: Insert "YES" into 'c-check'
        c_check = "YES"

        # Construct the new row as a dictionary and collect it
        rows.append({
            RPT_COL_CAMPAIGN: campaign,
            RPT_COL_ACCOUNT: row[RPT_COL_ACCOUNT],
            RPT_COL_FEED_CATEGORY: feed_category,
            RPT_COL_STUDIO_NAME: studio_name,
            RPT_COL_LANGUAGE_TARGET: language_target,
            RPT_COL_CAMPAIGN_LANGUAGE: language_target,
            RPT_COL_CCHECK: c_check
        })

    # Build the output DataFrame in a single step
    df_out = pd.DataFrame(rows, columns=out_columns)

    # Print the data changed for debugging purposes
    print("Data changed:")
    print(df_out)

    return df_out

# Trigger the main process
outputDf = process(inputDf)
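
Because both the studio name and the language target come from splitting on hyphens, campaign names that deviate from the expected pattern are worth checking. The sketch below applies the same two expressions to a few made-up campaign names; the helper name `studio_and_language` is hypothetical, and the names are illustrative rather than taken from real data.

```python
def studio_and_language(campaign: str) -> tuple:
    """Hypothetical helper reusing the script's two split-based expressions."""
    studio = campaign.split("-")[0].strip()
    language = campaign.split("-")[-1].strip()[:2]
    return studio, language


# Expected pattern: studio first, language last
print(studio_and_language("Acme Studios - Movie - EN"))     # ('Acme Studios', 'EN')

# A hyphen inside the studio name cuts the studio short
print(studio_and_language("Tri-Star - Movie - EN"))         # ('Tri', 'EN')

# No hyphen at all: both values are taken from the whole name
print(studio_and_language("Acme Studios Movie"))            # ('Acme Studios Movie', 'Ac')

# A hyphenated locale: only the part after the last hyphen is used
print(studio_and_language("Acme Studios - Movie - fr-CA"))  # ('Acme Studios', 'CA')
```

If such names can occur in the report, the output for those rows may need a manual review before upload.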

Post generated on 2025-03-11 01:25:51 GMT

29 Jul 2024
