Training
Our training endpoints let you programmatically train models on datasets you've uploaded to Akkio. They can be viewed as a better-designed v2 of our legacy /v1/models route.
Training is inherently a long-running operation, so it is exposed through a polling-based mechanism in which you make the following calls:
- One call to /new to submit the task
- Polling calls to /{task_id}/status to check on your task's status
- One last call to /{task_id}/result once /status indicates it's done
These endpoints are currently in beta and are subject to change as we refine them.
Example Model
Throughout this page, we'll assume a dataset already exists in Akkio with the following fields:
- Job Title
- Years of Experience
- Company Size
- Industry
- Location
- Webinars Attended
- Whitepapers Downloaded
- Pages Visited
- Days Since Last Visit
- Email Open Rate
- Email Click Rate
- Responded to Survey
- Positive Lead
We'll assume that "Positive Lead" is the attribute we're interested in training the model to predict.
Request Payload
We expect, as input, a JSON object with the following fields.
Mandatory:
- dataset_id is the ID of the dataset you'd like to train a model on.
Optional:
- predict_fields is a list of fields you want to train the model to predict. This will almost always be exactly one field.
- ignore_fields is a list of fields you want to ignore, so that they are not factored into the trained model.
- force always trains a fresh model, even if another one has already been trained with the exact same parameters.
- extra_attention helps improve accuracy on certain datasets where rare cases are common.
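As a minimal sketch, the payload described above could be assembled with a small helper that leaves out any option you do not set. The helper name `build_training_payload` is hypothetical, and treating `force` and `extra_attention` as booleans is an assumption, since this page does not state their types.

```python
import json
from typing import Any, Optional, Sequence


def build_training_payload(
    dataset_id: str,
    predict_fields: Optional[Sequence[str]] = None,
    ignore_fields: Optional[Sequence[str]] = None,
    force: Optional[bool] = None,
    extra_attention: Optional[bool] = None,
) -> dict[str, Any]:
    """Assemble the JSON body for a training request, omitting unset options."""
    if not dataset_id:
        raise ValueError("dataset_id is mandatory")
    payload: dict[str, Any] = {"dataset_id": dataset_id}
    if predict_fields is not None:
        payload["predict_fields"] = list(predict_fields)
    if ignore_fields is not None:
        payload["ignore_fields"] = list(ignore_fields)
    # Assumption: both flags are booleans; the page does not give their types.
    if force is not None:
        payload["force"] = force
    if extra_attention is not None:
        payload["extra_attention"] = extra_attention
    return payload


# Predict "Positive Lead" while keeping "Location" out of the model.
payload = build_training_payload(
    "YTV32jCdVf5DbcMxzvX5",
    predict_fields=["Positive Lead"],
    ignore_fields=["Location"],
)
print(json.dumps(payload, indent=2))
```

Omitting unset options, rather than sending explicit nulls, leaves the choice of defaults to the server.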
Example Call Sequence
Here's a rough outline of how you'd make a request to the training endpoints.
HTTP Headers
Header Name | Required | Value |
---|---|---|
X-API-Key | Yes | Your team's API key. See Authentication. |
1. Create Request
First, we'll submit the task into our asynchronous processing queue.
POST /api/v1/models/train/new
{
  "dataset_id": "YTV32jCdVf5DbcMxzvX5",
  "predict_fields": ["Positive Lead"]
}
You'll receive an object like this, containing a task ID:
{
  "task_id": "<task_id>"
}
We'll use this in the next request.
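The create request can be sketched with Python's standard library as follows. The function names are hypothetical, the host is taken from the URLs quoted on this page, and sending a `Content-Type: application/json` header is an assumption that this page does not confirm.

```python
import json
import urllib.request

SITE_ROOT = "https://api.akkio.com"  # host as quoted elsewhere on this page


def build_submit_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST that creates a training task."""
    return urllib.request.Request(
        url=f"{SITE_ROOT}/api/v1/models/train/new",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-API-Key": api_key,
            # Assumption: the API expects a JSON content type.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_training_task(api_key: str, payload: dict) -> str:
    """Send the request and return the task_id from the JSON response."""
    request = build_submit_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["task_id"]


# Building the request performs no network I/O, so it is safe to inspect.
request = build_submit_request(
    "YOUR_API_KEY",
    {"dataset_id": "YTV32jCdVf5DbcMxzvX5", "predict_fields": ["Positive Lead"]},
)
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes it possible to check the URL, headers, and body without calling the API.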
2. Query for Task Status
Next, we'll query the status endpoint at a regular cadence to see whether the task is complete yet. This request might look something like this:
GET /api/v1/models/train/<task_id>/status
Note that you must use the same Task ID that you received from the task creation endpoint above.
This will provide you with a status field set to one of SUBMITTED, IN_PROGRESS, FAILED, or SUCCEEDED. You can read more about each state on the Asynchronous Endpoints page.
Here's an example response you might get:
{
  "status": "IN_PROGRESS",
  "metadata": {
    "type": "IN_PROGRESS"
  }
}
You should retry ("poll") this endpoint at a regular cadence until you get a response that looks something like this:
{
  "status": "SUCCEEDED",
  "metadata": {
    "type": "SUCCEEDED",
    "location": "/api/v1/models/train/<task_id>/result"
  }
}
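One way to structure the polling loop is sketched below. The helper `poll_until_done` is hypothetical, and the interval and attempt limit are arbitrary illustrative values, not recommendations from Akkio. The status fetcher is passed in as a callable so the loop can be exercised without network access; in real use it would issue the GET request shown above.

```python
import time
from typing import Callable

TERMINAL_STATES = {"SUCCEEDED", "FAILED"}


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval_seconds: float = 5.0,
    max_attempts: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status until the task reaches a terminal state."""
    for attempt in range(max_attempts):
        body = fetch_status()
        if body["status"] in TERMINAL_STATES:
            return body
        # SUBMITTED or IN_PROGRESS: wait before the next attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"task still running after {max_attempts} status checks")


# Simulated responses, shaped like the examples on this page.
responses = iter([
    {"status": "IN_PROGRESS", "metadata": {"type": "IN_PROGRESS"}},
    {"status": "IN_PROGRESS", "metadata": {"type": "IN_PROGRESS"}},
    {
        "status": "SUCCEEDED",
        "metadata": {
            "type": "SUCCEEDED",
            "location": "/api/v1/models/train/<task_id>/result",
        },
    },
])
final = poll_until_done(lambda: next(responses), sleep=lambda seconds: None)
print(final["status"], final["metadata"]["location"])
```

The loop also returns on FAILED, so the caller should check the returned status before requesting the result.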
The location field is always relative to the API root (https://api.akkio.com/api/v1), not the overall website root (https://api.akkio.com). You'll need to construct the final URL from the site name, API root, and the provided location.
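A small helper can make the URL construction explicit. This is a defensive sketch rather than documented behavior: the name `resolve_location` is hypothetical, and because the example response on this page shows a location that already starts with /api/v1, the helper accepts both forms and avoids duplicating the prefix.

```python
SITE_ROOT = "https://api.akkio.com"
API_PREFIX = "/api/v1"


def resolve_location(location: str) -> str:
    """Turn a location value from the status response into an absolute URL."""
    path = "/" + location.lstrip("/")
    # Assumption: a location that already carries the API prefix is joined
    # to the site root directly, so the prefix is never duplicated.
    if path == API_PREFIX or path.startswith(API_PREFIX + "/"):
        return SITE_ROOT + path
    return SITE_ROOT + API_PREFIX + path


print(resolve_location("/models/train/abc123/result"))
print(resolve_location("/api/v1/models/train/abc123/result"))
```

Both calls produce the same absolute URL, so the caller does not need to know which form the server returned.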
Armed with this information, we'll move to the last request.
3. Query for Result
Using the location we got from the status call, we'll make a request for the final result.
GET /api/v1/models/train/<task_id>/result
You'll get a response that looks something like this:
{
  "status": "success",
  "model_id": "AyooDGG4IB7iJDkgLKJR",
  "stats": [
    [
      {
        "field": 12,
        "field_name": "Positive Lead",
        "field_type": "category",
        "class": 0,
        "class_name": "0",
        "count": 19,
        "true positives": 19,
        "false positives": 0,
        "false negatives": 0,
        "precision": 1.0,
        "recall": 1.0,
        "f1": 1.0,
        "frequency": 0.95
      },
      {
        "field": 12,
        "field_name": "Positive Lead",
        "field_type": "category",
        "class": 1,
        "class_name": "1",
        "count": 1,
        "true positives": 1,
        "false positives": 0,
        "false negatives": 0,
        "precision": 1.0,
        "recall": 1.0,
        "f1": 1.0,
        "frequency": 0.05
      }
    ]
  ],
  "field_importance": {
    "Job Title": 0.09528297930955887,
    "Years of Experience": 0.09974530339241028,
    "Company Size": 0.08637341856956482,
    "Industry": 0.08510678261518478,
    "Location": 0.050445690751075745,
    "Webinars Attended": 0.1286168396472931,
    "Whitepapers Downloaded": 0.10489732772111893,
    "Pages Visited": 0.08934492617845535,
    "Days Since Last Visit": 0.07931949943304062,
    "Email Open Rate": 0.08199360221624374,
    "Email Click Rate": 0.05456322059035301,
    "Responded to Survey": 0.04431045055389404,
    "Positive Lead": 8.27786672541464e-10
  },
  "data_story": [
    {
      "name": "Positive Lead",
      "type": "category",
      "outcomes": [
        {
          "outcome": "0",
          "causes": [
            {
              "field": "Webinars Attended",
              "top_value": "3",
              "bottom_value": "3"
            },
            {
              "field": "Whitepapers Downloaded",
              "top_value": "3",
              "bottom_value": "7"
            },
            {
              "field": "Years of Experience",
              "top_value": "6",
              "bottom_value": "7"
            },
            {
              "field": "Job Title",
              "top_value": "Executive",
              "bottom_value": "Assistant"
            },
            {
              "field": "Pages Visited",
              "top_value": "27",
              "bottom_value": "1"
            }
          ],
          "top_case": 1.0,
          "avg_case": 0.94,
          "bottom_case": 1.0
        },
        {
          "outcome": "1",
          "causes": [
            {
              "field": "Webinars Attended",
              "top_value": "3",
              "bottom_value": "3"
            },
            {
              "field": "Whitepapers Downloaded",
              "top_value": "7",
              "bottom_value": "3"
            },
            {
              "field": "Years of Experience",
              "top_value": "7",
              "bottom_value": "6"
            },
            {
              "field": "Job Title",
              "top_value": "Assistant",
              "bottom_value": "Executive"
            },
            {
              "field": "Pages Visited",
              "top_value": "1",
              "bottom_value": "27"
            }
          ],
          "top_case": 0.0,
          "avg_case": 0.06,
          "bottom_case": 0.0
        }
      ]
    }
  ]
}
You can take this result and use it however you wish.
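As one illustration of consuming the result, the sketch below pulls out the model ID, the per-class metrics, and the most important fields. The helper `summarize_result` is hypothetical, and the sample is a trimmed copy of the response above; it assumes the stats and field_importance keys are shaped as in that example.

```python
def summarize_result(result: dict, top_n: int = 3) -> dict:
    """Extract the model ID, per-class metrics, and top-ranked fields."""
    per_class = [
        {
            "class_name": entry["class_name"],
            "precision": entry["precision"],
            "recall": entry["recall"],
            "f1": entry["f1"],
        }
        for group in result["stats"]  # stats is a list of lists
        for entry in group
    ]
    # Rank fields from most to least important.
    ranked = sorted(
        result["field_importance"].items(),
        key=lambda item: item[1],
        reverse=True,
    )
    return {
        "model_id": result["model_id"],
        "per_class": per_class,
        "top_fields": [name for name, _ in ranked[:top_n]],
    }


# Trimmed copy of the example response shown above.
sample = {
    "model_id": "AyooDGG4IB7iJDkgLKJR",
    "stats": [[
        {"class_name": "0", "precision": 1.0, "recall": 1.0, "f1": 1.0},
        {"class_name": "1", "precision": 1.0, "recall": 1.0, "f1": 1.0},
    ]],
    "field_importance": {
        "Job Title": 0.09528297930955887,
        "Years of Experience": 0.09974530339241028,
        "Webinars Attended": 0.1286168396472931,
        "Whitepapers Downloaded": 0.10489732772111893,
        "Positive Lead": 8.27786672541464e-10,
    },
}
summary = summarize_result(sample)
print(summary["model_id"], summary["top_fields"])
```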
Additional Resources
- OpenAPI Reference: https://api.akkio.com/api/v1/docs
- OpenAPI Specification: https://api.akkio.com/api/v1/api.yaml