Marcus Wong, Keboola TEAM
Headmaster of Keboola Academy, School for Data Witchcraft and Wizardry
Asked a question 2 years ago

How can I add processors to a component configuration without processors UI?

I would use the configuration API and add that manually. It's actually described in the docs as well.

Martin Fiser, Keboola TEAM
Head of Professional Services @ Keboola

Adding the full guide here (from our internal Confluence knowledge base, thanks @Michal Hruska @Keboola):

How to add a processor

UI

Some components (such as AWS S3 Extractor or Azure BLOB Storage Extractor) support adding a processor directly in the UI configuration.

JSON Editor

The UI offers only a selection of the most typical processors for the respective component. If you need other processors, switch the component configuration to the JSON Editor. Similarly, some components have no UI support for processors at all, so you need to switch to the JSON Editor to add any processor to them.
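As a rough illustration of what ends up in the JSON Editor, the sketch below builds a configuration with a 'processors' section and prints it as JSON. The processor and its parameters are the ones used in the API example later in this guide. The empty 'parameters' object is only a placeholder for whatever your configuration already contains, and the placement of the section is assumed to match the row configuration in that example.

```python
import json

# Placeholder for the configuration already shown in the JSON Editor
# (assumption: yours will contain the component's real parameters).
configuration = {"parameters": {}}

# Processors listed under "after" run after the component itself.
configuration["processors"] = {
    "after": [
        {
            "definition": {"component": "keboola.processor-add-row-number-column"},
            "parameters": {"column_name": "myRowNumberColumn"},
        }
    ]
}

config_json = json.dumps(configuration, indent=2)
print(config_json)
```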

API

Some components (such as MySQL DB Extractor) support neither processors nor switching to the JSON Editor in the UI. For such components you need to use an API call to update the component's configuration or update the row configuration.
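The two endpoints differ only in the trailing row part of the path, so a small helper can build either URL. This is a minimal sketch: config_url is a hypothetical helper name, and the host is the one used in the examples in this guide, so it is an assumption that it matches your project.

```python
# Host and path as used in the examples of this guide (assumption: your
# project may live on a different host).
BASE_URL = 'https://connection.keboola.com/v2/storage/components/'


def config_url(component_id, config_id, row_id=None):
    """Return the URL of a configuration, or of one of its rows if row_id is given."""
    url = f'{BASE_URL}{component_id}/configs/{config_id}'
    if row_id is not None:
        url += f'/rows/{row_id}'
    return url


print(config_url('keboola.ex-db-mysql', '640999713'))
print(config_url('keboola.ex-db-mysql', '640999713', '15095'))
```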

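It's recommended to first fetch the component's current configuration and then update it with the processors added. The following examples are in Python.

1. Fetch component's details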

import requests


COMPONENT_ID = 'keboola.ex-db-mysql'
CONFIG_ID = '640999713'
# The base URL must end with a slash so that the component ID joins cleanly
URL = ''.join(['https://connection.keboola.com/v2/storage/components/',
              COMPONENT_ID,
              '/configs/',
              CONFIG_ID])
HEADERS = {'Content-Type': 'application/json',
           'X-StorageApi-Token': 'your_token'}

# Fetch the current configuration, including all of its rows
RESPONSE = requests.get(url=URL,
                        headers=HEADERS)

RESPONSE.json()

Response:

{
   "id":"640999713",
   "name":"Test Configuration",
   "description":"",
   "created":"2020-10-14T15:44:20+0200",
   "creatorToken":{
      "id":224099,
      "description":"michal.hruska@keboola.com"
   },
   "version":10,
   "changeDescription":"New configuration",
   "isDeleted":false,
   "configuration":{
      "parameters":{
         "db":{
            "port":3306,
            "ssh":{
               "sshPort":22
            },
            "host":"keboolademo.mysql.database.azure.com",
            "user":"data_ca@keboolademo",
            "#password":"KBC::ProjectSecure::eJwBQAG//mE6Mjp7aTowO3M6MTA0OiLe9QIAS1CLPk4xE9qXimBNC5BFk9nGYxfhLq1+uyt+PF/gYIfudKBVpFRDYbrBApIEJGH7QUQsk1grmkTNYVkOeCkwkzhxQFKFFBZt4eLpKa34IJrv5OmmxMvIb5jWh59On7Tfwnc91CI7aToxO3M6MTg0OiIBAgMAeGVez2/nHl36SUiQv1vPJofmrO9Ycm3Z8Zb1zMOFZ23AAZT7ostkq9Hl/V/h7eJqjGUAAAB+MHwGCSqGSIb3DQEHBqBvMG0CAQAwaAYJKoZIhvcNAQcBMB4GCWCGSAFlAwQBLjARBAy9cIBDXSJKdfjeD+cCARCAO8cP9sAi7WJXHxXHcSS9uqQcX1XL5tWY9+BK+z9wn4U7VXIwtdFWHLV9wXeRscAz+eQikI5s7bZTt47/Ijt9lXqLEQ==",
            "database":"keboola"
         }
      }
   },
   "rowsSortOrder":[
      
   ],
   "rows":[
      {
         "id":"15095",
         "name":"loan",
         "description":"",
         "configuration":{
            "parameters":{
               "columns":[
                  
               ],
               "primaryKey":[
                  
               ],
               "incremental":false,
               "outputTable":"in.c-keboola-ex-db-mysql-640999713.loan",
               "table":{
                  "schema":"keboola",
                  "tableName":"loan"
               }
            }
         },
         "isDisabled":false,
         "version":5,
         "created":"2021-01-06T10:57:13+0100",
         "creatorToken":{
            "id":330104,
            "description":"michal.hruska@keboola.com"
         },
         "changeDescription":"Adding processor",
         "state":{
            
         }
      }
   ],
   "state":{
      
   },
   "currentVersion":{
      "created":"2021-01-06T12:29:37+0100",
      "creatorToken":{
         "id":330104,
         "description":"michal.hruska@keboola.com"
      },
      "changeDescription":"New configuration"
   }
}
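Each row's settings sit under rows[i].configuration in this response, and that object is what the update step needs. A minimal sketch of pulling it out follows; row_configurations is a hypothetical helper name, and the sample data is a trimmed-down copy of the response above rather than a live API result.

```python
def row_configurations(config_detail):
    """Map each row ID to that row's 'configuration' object."""
    return {row['id']: row['configuration']
            for row in config_detail.get('rows', [])}


# Trimmed-down copy of the response above (most fields omitted).
detail = {
    'id': '640999713',
    'rows': [
        {'id': '15095',
         'name': 'loan',
         'configuration': {
             'parameters': {
                 'incremental': False,
                 'table': {'schema': 'keboola', 'tableName': 'loan'}}}},
    ],
}

rows = row_configurations(detail)
print(rows['15095'])
```

With a live call, the same helper could be applied to RESPONSE.json() from the request above.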

 

2. Update configuration row

Processors are set up for each configuration row separately in this case, so we need to update the row configuration to add processors.

See the 'processors' key (line 26 onwards) in the example:

import json
import requests


COMPONENT_ID = 'keboola.ex-db-mysql'
CONFIG_ID = '640999713'
ROW_ID = '15095'

URL = ''.join(['https://connection.keboola.com/v2/storage/components/',
              COMPONENT_ID,
              '/configs/',
              CONFIG_ID,
              '/rows/',
              ROW_ID])
HEADERS = {'X-StorageApi-Token':'your_token', 
           'Content-Type': 'application/x-www-form-urlencoded'}


# BASIC CONFIGURATION (Retrieved from previous API call) + PROCESSORS
PARAMS = {}
ROW = {'parameters': {'columns': [],
                      'primaryKey': [],
                      'incremental': False,
                      'outputTable': 'in.c-keboola-ex-db-mysql-640999713.loan',
                      'table': {'schema': 'keboola', 'tableName': 'loan'}},
       'processors': {
           # Processors listed under 'after' run once the extractor has finished
           'after': [
               {'definition': {'component': 'keboola.processor-create-manifest'},
                'parameters': {'delimiter': ',',
                               'enclosure': '"',
                               'incremental': False,
                               'primary_key': [],
                               'columns_from': 'header'}},
               {'definition': {'component': 'keboola.processor-add-row-number-column'},
                'parameters': {'column_name': 'myRowNumberColumn'}}
           ]}}
PARAMS['changeDescription'] = 'Adding processor'
PARAMS['configuration'] = json.dumps(ROW)


RESPONSE_UPDATE = requests.put(url = URL,
                               headers = HEADERS,
                               data= PARAMS)
RESPONSE_UPDATE.json()
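Copying the row configuration by hand is error-prone, so the payload can also be derived from the configuration fetched in step 1. The sketch below is one possible way to do that, not an official recipe: build_row_update is a hypothetical helper, and it assumes the row configuration has the same shape as in the response above.

```python
import copy
import json


def build_row_update(row_configuration, after_processors, change_description):
    """Return form fields for the row update with processors appended under 'after'."""
    # Work on a copy so the fetched configuration stays untouched.
    updated = copy.deepcopy(row_configuration)
    processors = updated.setdefault('processors', {})
    processors.setdefault('after', []).extend(after_processors)
    return {'changeDescription': change_description,
            'configuration': json.dumps(updated)}


# Row configuration as fetched in step 1 (trimmed for brevity).
current = {'parameters': {'incremental': False,
                          'table': {'schema': 'keboola', 'tableName': 'loan'}}}
new_processors = [
    {'definition': {'component': 'keboola.processor-add-row-number-column'},
     'parameters': {'column_name': 'myRowNumberColumn'}},
]

payload = build_row_update(current, new_processors, 'Adding processor')
print(payload['configuration'])
```

The resulting payload has the same two fields as PARAMS above, so it could be passed as the data argument of the requests.put call.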

Now, when the extractor is executed, the configured processors run after it; you can verify this in the job log.
