Script 1325: Scripts Campaign Tagging

Purpose

The script parses campaign names to extract and tag seminar details such as format, location, registration target, and seminar code.

To Elaborate

The Python script automates the extraction of seminar details from campaign names and tags each seminar-related campaign with them. It processes a dataset of campaign names and identifies four components: the seminar format, its location, the registration target, and a seminar code. Each component is added to the dataset as a new column, which supports structured budget allocation (SBA) and analysis. The script assumes a specific naming structure: the format is the first word; the location is the sequence of words between the format and the first full English month name (for example, ‘August’); the registration target is a number followed by ‘hh’; and the seminar code is ‘DN-’ followed by digits. Any component that cannot be found in a campaign name is left empty.
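To make the naming assumptions concrete, here is a minimal sketch that applies the script's two regular expressions to a single campaign name. The campaign name is hypothetical and was chosen only to fit the structure described above; real campaign names may differ.

```python
import re

# Hypothetical campaign name that follows the assumed structure.
name = "Dinner Fort Worth August 2024 50hh DN-10432"

# Registration target: digits followed by 'hh' (same pattern as the script).
reg_match = re.search(r'(\d+)\w*hh', name)
reg_target = reg_match.group(1) if reg_match else None

# Seminar code: 'DN-' followed by digits.
code_match = re.search(r'(DN-\d+)', name)
seminar_code = code_match.group(1) if code_match else None

# A space between the number and 'hh' prevents a match,
# because \w* does not span whitespace.
spaced = re.search(r'(\d+)\w*hh', "Dinner Fort Worth August 2024 50 hh DN-10432")

print(reg_target, seminar_code, spaced)  # prints: 50 DN-10432 None
```

Both captured values are strings, so the registration target would need an explicit int() conversion before any arithmetic.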

Walking Through the Code

  1. Configurable Parameters Setup
    • The script begins by defining configurable parameters, including the primary data source and the names of the output columns where extracted details will be stored.
    • The input DataFrame is assumed to have a column named ‘Campaign’, which contains the campaign names to be processed.
  2. Function Definition for Extraction
    • A function extract_seminar_details is defined to parse the campaign name and extract seminar details.
    • The function initializes default values for format, location, registration target, and seminar code.
    • It splits the campaign name into parts and extracts:
      • Format: Assumed to be the first part of the campaign name.
      • Location: Extracted from the parts following the format until a month name is encountered.
      • Registration Target: Identified as a number preceding ‘hh’ in the campaign name.
      • Seminar Code: Matches the pattern ‘DN-’ followed by digits.
  3. Data Processing and Output
    • The script copies all input rows to an output DataFrame.
    • It applies the extract_seminar_details function to the ‘Campaign’ column, creating new columns for each extracted detail.
    • Finally, if the output DataFrame is not empty, it prints the ‘Campaign’ column alongside the new columns so the tags can be reviewed before further analysis.
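The location rule in step 2 is the least obvious part of the parsing, so the sketch below isolates it. The helper name location_before_month and both campaign names are hypothetical; the loop mirrors the logic described above, including its reliance on an exact, full month name.

```python
# Full English month names; a word must match one of these exactly.
MONTHS = ['January', 'February', 'March', 'April', 'May', 'June',
          'July', 'August', 'September', 'October', 'November', 'December']

def location_before_month(campaign_name):
    """Join the words after the first one, stopping at the first month name."""
    location_parts = []
    for part in campaign_name.split()[1:]:
        if part in MONTHS:
            break
        location_parts.append(part)
    if location_parts:
        # Drop a trailing comma such as the one in "Fort Worth,"
        return ' '.join(location_parts).rstrip(',')
    return None

# A full month name ends the location.
with_month = location_before_month("Dinner Fort Worth, August 2024 50hh DN-10432")

# An abbreviated month is not recognized, so the rest of the name is kept.
without_month = location_before_month("Dinner Fort Worth Aug 2024 50hh DN-10432")

print(with_month)
print(without_month)
```

The second call shows a limitation worth knowing: when no full month name appears, everything after the format ends up in the location, including the registration target and seminar code.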

Vitals

  • Script ID : 1325
  • Client ID / Customer ID: 1306928223 / 60270455
  • Action Type: Bulk Upload (Preview)
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, Location, Reg Target, Seminar Code, Format
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: Grégory Pantaine (gpantaine@marinsoftware.com)
  • Created by Grégory Pantaine on 2024-08-14 17:51
  • Last Updated by emerryfield@marinsoftware.com on 2024-08-15 18:37

Python Code

##
## Name: Campaign Name - Dimension Auto Tagging: Seminar Details
## Description:
##  Parse Campaign Name and add Campaign-level Marin Dimensions Tags for Seminar Details.
##  Extracts Format, Location, Reg Target, and Seminar Code from the Campaign Name.
##
## author: Gregory Pantaine
## created: 2024-08-12
##

import re

import pandas as pd

########### Configurable Params - START ##########
# Primary data source and columns
inputDf = dataSourceDict["1"]  # Assuming 'Campaign' column exists in this DataFrame

# Output columns
OUTPUT_COL_FORMAT = 'Format'
OUTPUT_COL_LOCATION = 'Location'
OUTPUT_COL_REG_TARGET = 'Reg Target'
OUTPUT_COL_SEMINAR_CODE = 'Seminar Code'
CAMPAIGN_COL = 'Campaign'

# Function to extract seminar details from the campaign name
def extract_seminar_details(campaign_name):
    # Initialize default values (returned as-is when a detail is not found)
    seminar_format = None
    location = None
    reg_target = None
    seminar_code = None

    # Missing or non-text campaign names cannot be parsed; return the defaults
    if not isinstance(campaign_name, str):
        return pd.Series([seminar_format, location, reg_target, seminar_code])

    # Split the campaign name into parts on whitespace
    parts = campaign_name.split()

    # Extract Format (assumed to be the first part)
    if len(parts) >= 1:
        seminar_format = parts[0]

    # Extract Location: the parts after the format, up to the first part that
    # exactly matches a full English month name
    months = ['January', 'February', 'March', 'April', 'May', 'June',
              'July', 'August', 'September', 'October', 'November', 'December']
    location_parts = []
    for part in parts[1:]:
        if part in months:
            break
        location_parts.append(part)
    if location_parts:
        location = ' '.join(location_parts).rstrip(',')

    # Extract Reg Target: digits followed by 'hh', optionally with other word
    # characters in between (the captured value stays a string)
    reg_target_match = re.search(r'(\d+)\w*hh', campaign_name)
    if reg_target_match:
        reg_target = reg_target_match.group(1)

    # Extract Seminar Code (pattern: DN- followed by digits)
    seminar_code_match = re.search(r'(DN-\d+)', campaign_name)
    if seminar_code_match:
        seminar_code = seminar_code_match.group(1)

    return pd.Series([seminar_format, location, reg_target, seminar_code])

# Copy all input rows to output
outputDf = inputDf.copy()

# Apply the extraction function to the 'Campaign' column
outputDf[[OUTPUT_COL_FORMAT, OUTPUT_COL_LOCATION, OUTPUT_COL_REG_TARGET, OUTPUT_COL_SEMINAR_CODE]] = outputDf[CAMPAIGN_COL].apply(extract_seminar_details)

# Display the output DataFrame with the new columns
if not outputDf.empty:
    print("outputDf", tableize(outputDf[[CAMPAIGN_COL, OUTPUT_COL_FORMAT, OUTPUT_COL_LOCATION, OUTPUT_COL_REG_TARGET, OUTPUT_COL_SEMINAR_CODE]]))
else:
    print("Empty outputDf")

########### Configurable Params - END ##########
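
The script uses dataSourceDict and tableize without defining them, so they are presumably supplied by the scripting environment. To try the copy-and-apply pattern locally, one option is to stub both, as in the sketch below. The stub data, the tableize stand-in, and the simplified first_and_last_word function are all hypothetical and are not part of the script.

```python
import pandas as pd

# Hypothetical stand-in for the platform-provided data source dictionary.
dataSourceDict = {"1": pd.DataFrame({
    "Account": ["Demo Account", "Demo Account"],
    "Campaign": ["Dinner Fort Worth August 2024 50hh DN-10432", "Brand Generic"],
})}

def tableize(df):
    """Stand-in pretty-printer; the platform's own helper may format differently."""
    return df.to_string(index=False)

def first_and_last_word(name):
    # Simplified placeholder for extract_seminar_details: two values per row.
    parts = name.split()
    return pd.Series([parts[0], parts[-1]])

inputDf = dataSourceDict["1"]

# Copy all input rows, then expand the returned Series into new columns.
outputDf = inputDf.copy()
outputDf[["First", "Last"]] = outputDf["Campaign"].apply(first_and_last_word)

print(tableize(outputDf))
```

Because the function returns a pd.Series for each row, apply produces a DataFrame whose columns are assigned, in order, to the listed column names. This is the same mechanism the script relies on to fill its four output columns.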

Post generated on 2024-11-27 06:58:46 GMT
