Script 155: brand nonbrand

Purpose:

The Python script labels each marketing campaign as “Brand” or “Non-Brand” based on patterns in the campaign name.

To Elaborate

The Python script processes a dataset of marketing campaigns and classifies each campaign as either “Brand” or “Non-Brand” based on its name. A campaign is labeled “Brand” when its name contains “brand” (matched case-insensitively, as a substring) and does not match a regular expression for “non-brand.” Every other campaign, meaning one whose name lacks “brand” or matches the “non-brand” pattern, is labeled “Non-Brand.” This categorization supports structured budget allocation (SBA) by distinguishing brand-focused from non-brand-focused marketing efforts.
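The rule described above can be sketched for a single campaign name using only the standard library. This is a minimal illustration, not the script itself: the function name classify and the sample campaign names are hypothetical, while the pattern is the one used in the listing further down.

```python
import re

# Same pattern as the script: "non", at most one whitespace or hyphen, then "brand"
NON_BRAND = re.compile(r'non[\s-]?brand', re.IGNORECASE)

def classify(campaign_name: str) -> str:
    """Hypothetical helper: return 'Brand' or 'Non-Brand' for one campaign name."""
    has_brand = 'brand' in campaign_name.lower()
    if has_brand and not NON_BRAND.search(campaign_name):
        return 'Brand'
    return 'Non-Brand'

# Made-up campaign names for illustration only
for name in ['Acme Brand Exact', 'Acme Non-Brand Exact', 'acme nonbrand',
             'Acme NON BRAND', 'Generic Shoes', 'Acme Non_Brand']:
    print(name, '->', classify(name))
```

Note that the pattern only allows a single whitespace character or hyphen between “non” and “brand.” A name such as “Acme Non_Brand” (underscore separator) does not match it and would therefore be labeled “Brand” under this rule.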

Walking Through the Code

  1. UUID Generation:
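    • The script defines a function string_to_uuid that converts a string into a name-based UUID (version 5, SHA-1 hashing) within the predefined DNS namespace, so the same input always yields the same identifier. In this listing the function is called once on a placeholder string and its result is printed; the classification steps below do not use it.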
  2. Temporary Field Setup:
    • A temporary field TMP_FIELD (the output column name with the suffix _new, i.e. Brand vs NonBrand_new) is created in the input DataFrame to hold the classification results. This field is initially set to NaN.
  3. Regular Expression Compilation:
    • A case-insensitive regular expression, non[\s-]?brand, is compiled to identify “non-brand” campaigns. It matches “non” followed by at most one whitespace character or hyphen and then “brand,” so “nonbrand,” “non-brand,” and “non brand” all match.
  4. Campaign Classification:
    • The script classifies campaigns by checking if the campaign name contains the word “brand” but not the “non-brand” pattern. Such campaigns are labeled as “Brand.” Others are labeled as “Non-Brand.”
  5. Output Preparation:
    • The classification results are copied to the output DataFrame. Only campaigns whose new label differs from their current Brand vs NonBrand value are kept in the final output, so the bulk upload contains only rows that actually change.
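Steps 2 through 5 can be tried outside the platform on a toy DataFrame. The sketch below is an approximation under stated assumptions: input_df, output_df, and the sample rows are hypothetical stand-ins for the platform-provided inputDf and outputDf, and the pattern is passed as a string with case=False rather than as a compiled regex.

```python
import pandas as pd

# Hypothetical sample data standing in for the report rows
input_df = pd.DataFrame({
    'Account': ['A', 'A', 'A', 'A'],
    'Campaign': ['Acme Brand Search', 'Acme Non-Brand Search',
                 'Generic Shoes', 'Acme nonbrand DSA'],
    'Brand vs NonBrand': ['Brand', 'Brand', 'Non-Brand', 'Brand'],
})

names = input_df['Campaign']
# na=False treats a missing campaign name as "no match"
has_brand = names.str.contains('brand', case=False, na=False)
has_non_brand = names.str.contains(r'non[\s-]?brand', case=False, regex=True, na=False)

# "Brand" only when the name contains "brand" and is not a non-brand match
is_brand = has_brand & ~has_non_brand
input_df['Brand vs NonBrand_new'] = is_brand.map({True: 'Brand', False: 'Non-Brand'})

# Keep only rows whose label actually changes
changed = input_df['Brand vs NonBrand'] != input_df['Brand vs NonBrand_new']
output_df = input_df.loc[changed, ['Account', 'Campaign']].copy()
output_df['Brand vs NonBrand'] = input_df.loc[changed, 'Brand vs NonBrand_new']

print(output_df)
```

With this sample data, only the two campaigns currently marked “Brand” whose names match the non-brand pattern end up in output_df; rows whose label is already correct are dropped.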

Vitals

  • Script ID : 155
  • Client ID / Customer ID: 1306922573 / 2
  • Action Type: Bulk Upload
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, Brand vs NonBrand
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: Jonathan Reichl (jreichl@marinsoftware.com)
  • Created by Jonathan Reichl on 2023-05-31 10:13
  • Last Updated by Michael Huang on 2024-01-12 03:50

Python Code

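import datetime
import re

import numpy

# inputDf, outputDf, CLIENT_TIMEZONE and tableize are not defined in this listing;
# they are assumed to be supplied by the script runtime.

RPT_COL_CAMPAIGN = 'Campaign'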
RPT_COL_ACCOUNT = 'Account'
RPT_COL_BRAND_VSNONBRAND = 'Brand vs NonBrand'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_BRAND_VSNONBRAND = 'Brand vs NonBrand'

#outputDf[BULK_COL_BRAND_VSNONBRAND] = "<<YOUR VALUE>>"

def string_to_uuid(input_string):
    """Return a deterministic UUID string derived from input_string."""
    import uuid
    # Use a predefined namespace UUID, here the DNS namespace
    namespace_uuid = uuid.NAMESPACE_DNS
    # uuid5 derives the UUID from a SHA-1 hash of the namespace and the input
    return str(uuid.uuid5(namespace_uuid, input_string))

input_str = "your_input_string_here"
resulting_uuid = string_to_uuid(input_str)
print(resulting_uuid)

TMP_FIELD = BULK_COL_BRAND_VSNONBRAND + '_new'
# blank out tmp field
inputDf[TMP_FIELD] = numpy.nan

#set up regex for non brand 
pattern = r'non[\s-]?brand'
regex = re.compile(pattern, re.IGNORECASE)



today = datetime.datetime.now(CLIENT_TIMEZONE).date()
print(tableize(inputDf))


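# na=False treats a missing campaign name as "no match" so the boolean masks stay valid
contains_brand = inputDf[RPT_COL_CAMPAIGN].str.contains('brand', case=False, na=False)
contains_non_brand = inputDf[RPT_COL_CAMPAIGN].str.contains(regex, na=False)

# 'Brand': name contains "brand" (any case) and does not match the non-brand pattern
inputDf.loc[contains_brand & ~contains_non_brand, TMP_FIELD] = 'Brand'
# 'Non-Brand': every other campaign
inputDf.loc[~contains_brand | contains_non_brand, TMP_FIELD] = 'Non-Brand'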

print(tableize(inputDf))

print(inputDf.index.duplicated())

# copy new strategy to output
outputDf.loc[:,BULK_COL_BRAND_VSNONBRAND] = inputDf.loc[:, TMP_FIELD]

# only include campaigns with changed strategy in bulk file
outputDf = outputDf[ inputDf[TMP_FIELD].notnull() & (inputDf[BULK_COL_BRAND_VSNONBRAND] != inputDf[TMP_FIELD]) ]

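The string_to_uuid helper in the listing relies on uuid.uuid5, which derives a version 5 UUID from a namespace and a name, so equal inputs always produce equal identifiers. The short check below illustrates that property; the campaign names passed in are made up for the example.

```python
import uuid

def string_to_uuid(input_string: str) -> str:
    # Name-based UUID (version 5, SHA-1) in the DNS namespace, as in the listing
    return str(uuid.uuid5(uuid.NAMESPACE_DNS, input_string))

first = string_to_uuid('Acme Brand Search')
second = string_to_uuid('Acme Brand Search')
other = string_to_uuid('Acme Non-Brand Search')

print(first == second)            # same input, same UUID
print(first == other)             # different input, different UUID
print(uuid.UUID(first).version)   # version field of the generated UUID
```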

Post generated on 2025-03-11 01:25:51 GMT
