Script 1595: Script Autotag Campaign LOB

Purpose:

The Python script extracts the Line of Business (LOB) tag from campaign names by identifying the value between the first and second underscore.

To Elaborate

The script is designed to process campaign data and extract a specific tag known as the Line of Business (LOB) from campaign names. This tag is located between the first and second underscore in the campaign name. The script reads data from a primary data source, processes the campaign names to identify and extract the LOB, and then creates an output data frame containing the account and campaign information along with the extracted LOB. Rows where the LOB extraction fails are removed from the output, ensuring only valid entries are retained. This process helps in categorizing campaigns based on their business line, which can be useful for reporting and analysis purposes.

Walking Through the Code

  1. Data Preparation
    • The script begins by importing necessary modules and setting up the current date using the datetime module.
    • It defines the primary data source and specifies the columns to be used from this data source.
  2. Function Definition
    • A function named extract_lob is defined to extract the LOB from the campaign name. It splits the campaign name by underscores and returns the second element if available, or None if the format does not match or an error occurs.
  3. Data Processing
    • An output DataFrame is created by copying relevant columns from the input DataFrame.
    • The extract_lob function is applied to the campaign names in the output DataFrame to populate the LOB column.
  4. Data Cleaning
    • Rows where the LOB extraction results in None are removed from the output DataFrame to ensure only valid entries are retained.
  5. Output
    • The script prints the first few rows of both the input and output DataFrames for debugging purposes, showcasing the transformation and extraction process.

Vitals

  • Script ID : 1595
  • Client ID / Customer ID: 1306916369 / 2
  • Action Type: Bulk Upload (Preview)
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, Line of Business_Dim
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: Grégory Pantaine (gpantaine@marinsoftware.com)
  • Created by Grégory Pantaine on 2024-12-23 20:51
  • Last Updated by Grégory Pantaine on 2024-12-23 20:51
> See it in Action

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
##
## name: Script Autotag Campaign LOB
## description:
##  
## 
## author: 
## created: 2024-12-23
## 
##
## name: Script Autotag Campaign LOB
## description: Extracts Line of Business tag from campaign names (value between first and second underscore)
## 
## author: 
## created: 2024-12-23
## 

import datetime

today = datetime.datetime.now(CLIENT_TIMEZONE).date()

# primary data source and columns
inputDf = dataSourceDict["1"]
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN_STATUS = 'Campaign Status'
RPT_COL_CAMPAIGN_TYPE = 'Campaign Type'
RPT_COL_IMPR = 'Impr.'
RPT_COL_CLICKS = 'Clicks'
RPT_COL_CTR = 'CTR %'
RPT_COL_DAILY_BUDGET = 'Daily Budget'
RPT_COL_LAST_MODIFIED_DATE = 'Last Modified Date'
RPT_COL_RECOMMENDED_DAILY_BUDGET = 'Recommended Daily Budget'
RPT_COL_PUB_COST = 'Pub. Cost £'
RPT_COL_CAMPAIGN_CREATION_DATE = 'Campaign Creation Date'
RPT_COL_PUB_ID = 'Pub. ID'
RPT_COL_CAMPAIGN_OVERRIDE = 'Campaign Override'
RPT_COL_NETWORK = 'Network'
RPT_COL_NETWORK_TARGETING = 'Network Targeting'
RPT_COL_PUBLISHER = 'Publisher'
RPT_COL_PUBLISHER_BID_CAP = 'Publisher Bid Cap'
RPT_COL_PUBLISHER_BID_PORTFOLIO_NAME = 'Publisher Bid Portfolio Name'
RPT_COL_PUBLISHER_BID_STRATEGY = 'Publisher Bid Strategy'
RPT_COL_PUBLISHER_BID_STRATEGY_LEVEL = 'Publisher Bid Strategy Level'
RPT_COL_PUBLISHER_NAME = 'Publisher Name'
RPT_COL_PUBLISHER_TARGET_IS = 'Publisher Target IS %'
RPT_COL_PUBLISHER_TARGET_IS_TYPE = 'Publisher Target IS Type'
RPT_COL_PUBLISHER_TARGET_ROAS = 'Publisher Target ROAS'
RPT_COL_PUBLISHER_TARGET_CPA = 'Publisher Target CPA'
RPT_COL_COUNTRY_DIM = 'Country_Dim'
RPT_COL_FRIENDLYNAME_DIM = 'FriendlyName_Dim'
RPT_COL_LINE_OF_BUSINESS_DIM = 'Line of Business_Dim'
RPT_COL_PROGRAM_DIM = 'Program_Dim'

# output columns and initial values
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_LINE_OF_BUSINESS_DIM = 'Line of Business_Dim'

# Function to extract value between first and second underscore
def extract_lob(campaign_name):
    try:
        parts = campaign_name.split('_')
        if len(parts) >= 2:
            return parts[1]
        return None  # Return None if format doesn't match
    except:
        return None  # Return None for any errors

# Create output dataframe with required columns
outputDf = inputDf[[RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN]].copy()
outputDf.columns = [BULK_COL_ACCOUNT, BULK_COL_CAMPAIGN]

# Apply the extraction function to get LOB
outputDf[BULK_COL_LINE_OF_BUSINESS_DIM] = outputDf[BULK_COL_CAMPAIGN].apply(extract_lob)

# Remove rows where LOB extraction failed (where value is None)
outputDf = outputDf.dropna(subset=[BULK_COL_LINE_OF_BUSINESS_DIM])

# Debug print first few rows
print(tableize(inputDf.head()))
print("\nOutput DataFrame:")
print(tableize(outputDf.head()))

Post generated on 2025-03-11 01:25:51 GMT

comments powered by Disqus