Script 1581: Script Set SFDC ID In Dimension
Purpose:
The script extracts and tags the SFDC ID from campaign names in a dataset by identifying the value after the last ‘|’ character.
To Elaborate
The Python script is designed to process a dataset containing campaign information and extract the SFDC ID from the campaign name. The SFDC ID is assumed to be located after the last ‘|’ character in the campaign name. The script iterates through each row of the dataset, checks if the campaign name contains the specified delimiter, and extracts the SFDC ID if present. The extracted ID is then tagged to the corresponding row in the output dataset. The script ensures that only non-empty SFDC IDs are included in the final output, and it also removes any extra whitespace from the campaign names to maintain data consistency.
Walking Through the Code
- Configurable Parameters:
-
The script begins by defining a configurable parameter PLACEMENT_KEY
, which is set to the ‘’ character. This character is used to split the campaign name to extract the SFDC ID.
-
- Data Source and Columns:
- The script retrieves the primary data source from a dictionary
dataSourceDict
and assigns it toinputDf
. It also defines several column names for both input and output dataframes, such as ‘Campaign’, ‘Account’, ‘SFDC ID’, and ‘Campaign ID’.
- The script retrieves the primary data source from a dictionary
- SFDC ID Extraction Function:
- A function
get_sfdc_id
is defined to split the campaign name using thePLACEMENT_KEY
. It returns the last segment of the split name as the SFDC ID if there are multiple segments; otherwise, it returnsnp.nan
.
- A function
- Data Processing:
- The script creates a copy of the input dataframe
inputDf
tooutputDf
for processing. - It iterates over each row in
inputDf
, extracting the campaign name and checking for the presence of thePLACEMENT_KEY
. - If the campaign name contains the delimiter, the script calls
get_sfdc_id
to extract the SFDC ID and updates the corresponding row inoutputDf
.
- The script creates a copy of the input dataframe
- Output Preparation:
- The script filters out rows with empty SFDC IDs from
outputDf
. - It also removes any extra whitespace from the campaign names to ensure data integrity.
- Finally, it prints the first few rows of the processed output dataframe if it is not empty.
- The script filters out rows with empty SFDC IDs from
Vitals
- Script ID : 1581
- Client ID / Customer ID: 1306928593 / 60270593
- Action Type: Bulk Upload (Preview)
- Item Changed: Campaign
- Output Columns: Account, Campaign, Campaign ID, GUID
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: Grégory Pantaine (gpantaine@marinsoftware.com)
- Created by Grégory Pantaine on 2024-12-18 17:28
- Last Updated by ascott@marinsoftware.com on 2025-02-28 19:29
> See it in Action
Python Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
##
## name: Script AutoTag - SFDC ID Dimension setting
## description: Tags the SFDC ID value from the campaign name, to the right of the last | in the campaign name.
##
## author: G Pantaine with help from ChatGPT & M Huang.
## created: 2024-12-18
##
# Configurable Params - START
PLACEMENT_KEY = '|'
# Primary data source and columns
inputDf = dataSourceDict["1"]
# primary data source and columns
inputDf = dataSourceDict["1"]
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN_ID = 'Campaign ID'
RPT_COL_SFDC_ID = 'SFDC ID'
RPT_COL_CAMPAIGN_STATUS = 'Campaign Status'
RPT_COL_PUB_COST = 'Pub. Cost $'
# output columns and initial values
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_SFDC_ID = 'SFDC ID'
BULK_COL_CAMPAIGN_ID = 'Campaign ID'
# Function to extract the SFDC ID (value after the last '|')
def get_sfdc_id(campaign_name):
parts = campaign_name.split(PLACEMENT_KEY)
if len(parts) > 1:
return parts[-1].strip()
else:
return np.nan
# Copy all input rows to output
outputDf = inputDf.copy()
# Loop through all rows
for index, row in inputDf.iterrows():
campaign_name = row[RPT_COL_CAMPAIGN]
# Skip processing if campaign name does not contain at least one placement key
if campaign_name.count(PLACEMENT_KEY) < 1:
continue
sfdc_id = get_sfdc_id(campaign_name)
# Only tag if it's different than the existing tag
if pd.notna(sfdc_id):
outputDf.at[index, BULK_COL_SFDC_ID] = sfdc_id
else:
outputDf.at[index, BULK_COL_SFDC_ID] = np.nan
# Only include non-empty tags in bulk
outputDf = outputDf.dropna(subset=[BULK_COL_SFDC_ID])
# Remove extra whitespace from campaign name that breaks Preview
outputDf[RPT_COL_CAMPAIGN] = outputDf[RPT_COL_CAMPAIGN].str.strip()
if not outputDf.empty:
print("outputDf", outputDf.head().to_string())
else:
print("Empty outputDf")
Post generated on 2025-03-11 01:25:51 GMT