Script 1325: Scripts Campaign Tagging

Purpose:

The Python script parses campaign names to extract and tag seminar details such as format, location, registration target, and seminar code.

To Elaborate

The script automates the extraction of seminar details from campaign names and tags each campaign with the corresponding dimensions. It processes a dataset of campaign names and extracts four pieces of information: the seminar format, the location, the registration target, and the seminar code. These are added to the dataset as new columns, which supports structured budget allocation (SBA) and analysis. The script assumes a fixed structure in the campaign name: the format is the first word, the location is the run of words between the format and the first full month name (such as 'August'), the registration target is a number followed by 'hh', and the seminar code is 'DN-' followed by digits. If a campaign name does not follow this structure, the fields that cannot be found are left empty, and when no month name is present the location absorbs every word after the format.
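
As a rough illustration of the naming convention described above, the two regular expressions used by the script can be tried on a single name. The campaign name below is hypothetical and invented for this sketch; real campaign names may differ.

```python
import re

# Hypothetical campaign name, made up for illustration only.
name = "Dinner Tampa Bay, August 14 100hh DN-1234"

# Same patterns as in the script: a number ahead of 'hh', and 'DN-' plus digits.
reg_target = re.search(r'(\d+)\w*hh', name)
seminar_code = re.search(r'(DN-\d+)', name)

print(reg_target.group(1))    # 100
print(seminar_code.group(1))  # DN-1234
```

Note that the day of the month ("14") is not picked up as the registration target, because it is not followed by 'hh'.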

Walking Through the Code

Configurable Parameters:
- The script begins by defining configurable parameters, including the primary data source (inputDf) and the names of the output columns for the extracted details.
Function Definition:
- A function extract_seminar_details is defined to parse the campaign name. It initializes default values for format, location, registration target, and seminar code.
- The campaign name is split into parts, and the first part is assumed to be the format.
- The location is built from the parts that follow the format, stopping at the first part that exactly matches a full English month name (for example 'August'); a trailing comma is stripped from the result.
- The registration target is identified with a regular expression that captures a number followed by 'hh', optionally with other word characters in between.
- The seminar code is extracted with a regular expression that matches 'DN-' followed by digits.
Data Processing:
- The script creates a copy of the input DataFrame (outputDf) to preserve the original data.
- It applies the extract_seminar_details function to the ‘Campaign’ column, populating the new columns with the extracted details.
Output:
- The script checks if the output DataFrame is not empty and then displays it with the new columns added. If the DataFrame is empty, it prints a message indicating so.
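
The steps above can be sketched end to end outside the platform. This is a minimal sketch, not the production script: it assumes pandas is installed, builds a small hand-made DataFrame in place of the platform-provided data source, and uses two hypothetical campaign names invented for illustration. The names `input_df`, `output_df`, `MONTHS`, and `NEW_COLS` are likewise assumptions of this sketch.

```python
import re
import pandas as pd

MONTHS = {'January', 'February', 'March', 'April', 'May', 'June',
          'July', 'August', 'September', 'October', 'November', 'December'}

def extract_seminar_details(campaign_name):
    parts = campaign_name.split()
    seminar_format = parts[0] if parts else None

    # Collect words after the format until a part is exactly a month name
    location_parts = []
    for part in parts[1:]:
        if part in MONTHS:
            break
        location_parts.append(part)
    location = ' '.join(location_parts).rstrip(',') if location_parts else None

    reg = re.search(r'(\d+)\w*hh', campaign_name)
    code = re.search(r'(DN-\d+)', campaign_name)
    return pd.Series([seminar_format, location,
                      reg.group(1) if reg else None,
                      code.group(1) if code else None])

# Hypothetical sample data standing in for the platform data source
input_df = pd.DataFrame({'Campaign': [
    "Dinner Tampa Bay, August 14 100hh DN-1234",
    "Lunch Austin",  # no month, no target, no code
]})

NEW_COLS = ['Format', 'Location', 'Reg Target', 'Seminar Code']
output_df = input_df.copy()
output_df[NEW_COLS] = output_df['Campaign'].apply(extract_seminar_details)
print(output_df)
```

The second row shows the fallback behaviour: with no month name in the campaign name, every word after the format becomes the location, and the registration target and seminar code stay empty.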

Vitals

Script ID : 1325
Client ID / Customer ID: 1306928223 / 60270455
Action Type: Bulk Upload (Preview)
Item Changed: Campaign
Output Columns: Account, Campaign, Location, Reg Target, Seminar Code, Format
Linked Datasource: M1 Report
Reference Datasource: None
Owner: Grégory Pantaine (gpantaine@marinsoftware.com)
Created by Grégory Pantaine on 2024-08-14 17:51
Last Updated by emerryfield@marinsoftware.com on 2024-08-15 18:37


Python Code

##
## Name: Campaign Name - Dimension Auto Tagging: Seminar Details
## Description:
##  Parse Campaign Name and add Campaign-level Marin Dimensions Tags for Seminar Details.
##  Extracts Format, Location, Reg Target, and Seminar Code from the Campaign Name.
##
## author: Gregory Pantaine
## created: 2024-08-12
##

# Explicit imports so the script is self-contained
import re
import pandas as pd

########### Configurable Params - START ##########
# Primary data source and columns
inputDf = dataSourceDict["1"]  # Assuming 'Campaign' column exists in this DataFrame

# Output columns
OUTPUT_COL_FORMAT = 'Format'
OUTPUT_COL_LOCATION = 'Location'
OUTPUT_COL_REG_TARGET = 'Reg Target'
OUTPUT_COL_SEMINAR_CODE = 'Seminar Code'
CAMPAIGN_COL = 'Campaign'

# Function to extract seminar details from the campaign name
def extract_seminar_details(campaign_name):
    # Initialize default values
    seminar_format = None  # named to avoid shadowing the built-in format()
    location = None
    reg_target = None
    seminar_code = None

    # Non-string values (e.g. NaN for a missing name) cannot be parsed
    if not isinstance(campaign_name, str):
        return pd.Series([seminar_format, location, reg_target, seminar_code])

    # Split the campaign name into parts based on whitespace
    parts = campaign_name.split()

    # Extract Format (assumed to be the first part)
    if len(parts) >= 1:
        seminar_format = parts[0]

    # Extract Location (from the second part up to the first part that is
    # exactly a full month name; if no month is found, all remaining parts are used)
    months = ['January', 'February', 'March', 'April', 'May', 'June',
              'July', 'August', 'September', 'October', 'November', 'December']
    location_parts = []
    for part in parts[1:]:
        if part in months:
            break
        location_parts.append(part)
    if location_parts:
        # Drop the trailing comma that separates the location from the date
        location = ' '.join(location_parts).rstrip(',')

    # Extract Reg Target (number preceding 'hh'; \w* allows extra word
    # characters between the number and 'hh')
    reg_target_match = re.search(r'(\d+)\w*hh', campaign_name)
    if reg_target_match:
        reg_target = reg_target_match.group(1)

    # Extract Seminar Code (pattern: DN- followed by digits)
    seminar_code_match = re.search(r'(DN-\d+)', campaign_name)
    if seminar_code_match:
        seminar_code = seminar_code_match.group(1)

    # Order must match the output columns assigned below
    return pd.Series([seminar_format, location, reg_target, seminar_code])

# Copy all input rows to output
outputDf = inputDf.copy()

# Apply the extraction function to the 'Campaign' column
outputDf[[OUTPUT_COL_FORMAT, OUTPUT_COL_LOCATION, OUTPUT_COL_REG_TARGET, OUTPUT_COL_SEMINAR_CODE]] = outputDf[CAMPAIGN_COL].apply(extract_seminar_details)

# Display the output DataFrame with the new columns
if not outputDf.empty:
    print("outputDf", tableize(outputDf[[CAMPAIGN_COL, OUTPUT_COL_FORMAT, OUTPUT_COL_LOCATION, OUTPUT_COL_REG_TARGET, OUTPUT_COL_SEMINAR_CODE]]))
else:
    print("Empty outputDf")

########### Configurable Params - END ##########

Post generated on 2025-03-11 01:25:51 GMT

14 Aug 2024
