Script 1567: Assign Campaign to Strategy

Purpose:

The Python script assigns a strategy name to each campaign by extracting it from a specific field in the campaign data.

To Elaborate

The script processes a dataset containing campaign information to assign a strategy name to each campaign. It checks if the ‘Campaign’ column exists in the input data. If it does, the script extracts the last field from the ‘Campaign’ column, assuming it contains a strategy name, and assigns it to the ‘Strategy’ column in the output data. If the ‘Campaign’ column is missing, it handles this by setting the ‘Strategy’ column to a default value, such as NaN. The script also provides a summary of the unique strategies and campaigns, and it identifies the most and least frequent strategies. This helps in understanding the distribution of strategies across campaigns, which is crucial for structured budget allocation (SBA) and strategic planning.

Walking Through the Code

  1. Data Initialization
    • The script begins by defining the primary data source and initializing the output DataFrame with specific columns for ‘Account’, ‘Campaign’, ‘Strategy’, and ‘Campaign ID’.
    • The ‘Strategy’ column is initially set to a placeholder value "<<YOUR VALUE>>".
  2. Function Definition
    • A function extract_last_field is defined to extract the last field from a string if it contains exactly seven fields separated by ‘ ’. This function is used to parse the strategy name from the ‘Campaign’ column.
  3. Campaign Column Check
    • The script checks if the ‘Campaign’ column exists in the input DataFrame.
    • If it exists, the script applies the extract_last_field function to populate the ‘Strategy’ column in the output DataFrame.
    • If the ‘Campaign’ column does not exist, it prints a message and sets the ‘Strategy’ column to NaN.
  4. Output and Analysis
    • The script prints a confirmation message indicating that the ‘SFDC ID’ column has been added.
    • It calculates and prints the number of unique strategies and campaigns.
    • It analyzes the distribution of strategy values, identifying the top 10 least and most frequent strategies.
    • Finally, it prints the first few rows of the input DataFrame for verification purposes.

Vitals

  • Script ID : 1567
  • Client ID / Customer ID: 1306928593 / 60270593
  • Action Type: Bulk Upload (Preview)
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, Strategy
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: ascott@marinsoftware.com (ascott@marinsoftware.com)
  • Created by ascott@marinsoftware.com on 2024-12-13 18:27
  • Last Updated by Michael Huang on 2025-01-15 05:15
> See it in Action

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
##
## name: Set SFDC ID
## description:
##  
## 
## author: 
## created: 2024-12-13
## 

today = datetime.datetime.now(CLIENT_TIMEZONE).date()

# primary data source and columns
inputDf = dataSourceDict["1"]

# output columns and initial values
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_STRATEGY = 'Strategy'
BULK_COL_CAMPID = 'Campaign ID'
outputDf[BULK_COL_STRATEGY] = "<<YOUR VALUE>>"

# Function to extract the last field if there are exactly 7 fields
def extract_last_field(value):
    fields = value.split('|')
    return fields[-1].strip() if len(fields) == 7 else ''



# Check if the 'Campaign' column exists in the inputDf DataFrame
if BULK_COL_CAMPAIGN in inputDf.columns:

    # parse out last field as Strategy name
    outputDf[BULK_COL_STRATEGY] = outputDf[BULK_COL_CAMPAIGN].apply(extract_last_field)
else:
    # If the 'Campaign' column does not exist, you might want to handle this case appropriately
    print(f"The column '{BULK_COL_CAMPAIGN}' does not exist in the input DataFrame.")
    # You could set the SFDC ID column to NaN or some default value
    outputDf[BULK_COL_STRATEGY] = np.nan

# Print the result
print("SFDC ID column added to outputDf")

print("Strategy count: ", outputDf[BULK_COL_STRATEGY].nunique())
print("Campaign count: ", outputDf[BULK_COL_CAMPAIGN].nunique())
print("OutputDf.shape", outputDf.shape)

# Count value distribution for the parsed field
value_distribution = outputDf[BULK_COL_STRATEGY].value_counts()

# Sort by least frequent values and print the top 10
least_frequent = value_distribution.sort_values(ascending=True).head(10)
most_frequent = value_distribution.sort_values(ascending=False).head(10)


print("Top 10 Least Frequent Strategies:")
print(least_frequent)

print("Top 10 Most Frequent Strategies:")
print(most_frequent)

# Print the first few rows of the input DataFrame for verification
print(tableize(inputDf.head()))

Post generated on 2025-03-11 01:25:51 GMT

comments powered by Disqus