YAML Configuration

Astonish uses YAML files to define agentic flows. This document provides a comprehensive reference for the YAML configuration format, including all available fields, their meanings, and examples.

Overview

An Astonish agent is defined by a YAML file with the following structure:

description: A brief description of what the agent does
nodes:
  - name: node1
    type: input
    # ... node configuration ...
  - name: node2
    type: llm
    # ... node configuration ...
  # ... more nodes ...
flow:
  - from: START
    to: node1
  - from: node1
    to: node2
  # ... more flow connections ...
  - from: nodeN
    to: END
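
As a minimal sketch of the structure above, the following agent asks one question and passes it to an LLM node. It uses only fields described later in this reference; the node names, prompts, and variable names (`question`, `answer`) are illustrative assumptions, not part of Astonish itself.

```yaml
description: Answers a single user question
nodes:
  # Input node: stores the user's text in the state as "question"
  - name: get_question
    type: input
    prompt: |
      What would you like to know?
    output_model:
      question: str
  # LLM node: reads "question" from the state and writes "answer"
  - name: answer_question
    type: llm
    system: |
      You are a helpful assistant that provides concise and accurate information.
    prompt: |
      Answer the following question: {question}
    output_model:
      answer: str
    user_message:
      - answer
flow:
  - from: START
    to: get_question
  - from: get_question
    to: answer_question
  - from: answer_question
    to: END
```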

Top-Level Fields

`description`

A string that describes the purpose and functionality of the agent.

description: An agent that searches the web for information and summarizes the results

`nodes`

An array of node objects that define the steps in the workflow. Each node represents a specific action or decision point in the agent's execution.

nodes:
  - name: get_query
    type: input
    # ... node configuration ...
  - name: search_web
    type: llm
    # ... node configuration ...

`flow`

An array of connection objects that define how nodes are connected. Each connection specifies a source node (from) and a destination node (to), optionally with conditions for branching paths.

flow:
  - from: START
    to: get_query
  - from: get_query
    to: search_web
  - from: search_web
    to: END

Node Configuration

Common Fields

These fields are available for all node types:

`name`

A unique identifier for the node. This is used to reference the node in the flow.

name: get_user_query

`type`

The type of the node. Must be one of "input", "llm", or "tool".

type: input

`prompt`

The text to display to the user (for input nodes) or send to the AI model (for LLM nodes). Can include variables from the state using curly braces.

prompt: |
  What would you like to search for?

`output_model`

Defines the variable names and types for the node's output. The variables will be added to the state and can be used by other nodes.

output_model:
  search_query: str
  results_count: int

`user_message`

An array of variable names to display to the user after the node is processed. The variables must be defined in the output_model.

user_message:
  - search_results
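
To show how `output_model` and `user_message` relate, here is a sketch of a single node that writes two variables to the state but displays only one of them. The node itself is hypothetical, and `search_query` is assumed to have been set by an earlier node.

```yaml
- name: search_web
  type: llm
  prompt: |
    Search for: {search_query}
  output_model:
    search_results: str   # shown to the user (listed in user_message)
    results_count: int    # kept in the state only
  user_message:
    - search_results
```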

Input Node Fields

`options`

An array of predefined options for the user to choose from. If provided, the user will be presented with a selection menu instead of a free-form input field.

options:
  - "Option 1"
  - "Option 2"
  - "Option 3"
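
In context, `options` sits alongside the common fields of an input node. The sketch below assumes that the selected option is stored in the variable declared in `output_model`; the node name, option labels, and variable name are made up for illustration.

```yaml
- name: choose_format
  type: input
  prompt: |
    Which output format would you like?
  options:
    - "Summary"
    - "Bullet points"
  output_model:
    output_format: str
```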

LLM Node Fields

`system`

The system message to send to the AI model. This is used to set the context and behavior of the AI.

system: |
  You are a helpful assistant that provides concise and accurate information.

`parallel`

A configuration object that enables parallel processing in the LLM node. The node processes the items of a list concurrently instead of one at a time, which can shorten the total run time when the items are independent of one another.

parallel:
  forEach: "{list_variable}"
  as: "item_name"
  maxConcurrency: 30

forEach: Specifies the state variable (enclosed in curly braces) containing the list of items to process in parallel.
as: Defines the name to use for each item in the parallel processing context.
maxConcurrency: Sets the maximum number of concurrent operations allowed.

`output_action`

Specifies how the output of parallel processing is handled. With parallel, only "append" is currently supported; it adds each result to a list.

output_action: append
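
Because `parallel` and `output_action` are used together, a combined sketch may help. The node below is hypothetical: it assumes an earlier node has put a list named `sentences` into the state, and it collects one result per item into `translations`.

```yaml
- name: translate_sentences
  type: llm
  parallel:
    forEach: "{sentences}"   # state variable holding the list (assumed to exist)
    as: "sentence"           # name of the current item inside the prompt
    maxConcurrency: 3
  output_action: append      # each result is appended to the output list
  prompt: |
    Translate the following sentence into French:
    {sentence}
  output_model:
    translations: list
```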

`tools`

A boolean indicating whether the node can use tools. If true, the node will be able to use tools specified in tools_selection.

tools: true

`tools_selection`

An array of tool names that the node can use. The tools must be available in the system.

tools_selection:
  - read_file
  - web_search

`tools_auto_approval`

A boolean indicating whether tool usage is approved automatically. If false, the user will be prompted to approve each tool usage.

tools_auto_approval: false
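
The three tool-related fields are typically set together. The following sketch shows a hypothetical LLM node that may call `read_file` only after the user approves each call; `file_path` is an assumed state variable set by an earlier node.

```yaml
- name: summarize_file
  type: llm
  prompt: |
    Read the file at {file_path} and summarize its contents.
  output_model:
    file_summary: str
  tools: true
  tools_selection:
    - read_file
  tools_auto_approval: false   # prompt the user before every tool call
```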

`raw_tool_output`

An object mapping state variable names to types for storing raw tool output directly in the state. This is useful for large or complex tool outputs that you don't want the LLM to process.

raw_tool_output:
  pr_diff: str

`print_state`

A boolean indicating whether to print the state after the node is processed. Useful for debugging.

print_state: true

`print_prompt`

A boolean indicating whether to print the prompt sent to the AI model. Useful for debugging.

print_prompt: true

`limit`

An integer specifying the maximum number of times the node can be executed in a loop. Used in conjunction with limit_counter_field.

limit: 5

`limit_counter_field`

The variable name for the loop counter. The counter is incremented each time the node is executed and reset when it reaches the limit.

limit_counter_field: iteration_count
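
The fragment below sketches where `limit` and `limit_counter_field` go on a node that loops back to itself. The node, the `draft` and `remaining_issues` variables, and the self-referencing edge are assumptions for illustration. This reference does not spell out what happens once the limit is reached, so the sketch shows placement only.

```yaml
nodes:
  - name: refine_draft
    type: llm
    prompt: |
      Improve the following draft: {draft}
      Report how many issues remain.
    output_model:
      draft: str
      remaining_issues: int
    limit: 5                              # run at most 5 times in the loop
    limit_counter_field: iteration_count  # counter kept in the state
flow:
  - from: refine_draft
    edges:
      - to: refine_draft
        condition: "lambda x: x['remaining_issues'] > 0"
      - to: END
        condition: "lambda x: x['remaining_issues'] == 0"
```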

Tool Node Fields

`args`

An object mapping argument names to values for the tool. Values can be literals or references to state variables using curly braces.

args:
  file_path: "/path/to/file.txt"
  content: {generated_content}

`tools_selection`

An array of tool names that the node can use. The first tool in the list will be executed.

tools_selection:
  - chunk_pr_diff

Flow Configuration

Basic Connections

The simplest form of connection is a direct link from one node to another:

flow:
  - from: node1
    to: node2

Special Nodes

There are two special nodes in the flow:

START: The entry point of the flow
END: The exit point of the flow

flow:
  - from: START
    to: first_node
  - from: last_node
    to: END

Conditional Edges

For branching paths, you can use conditional edges with lambda functions:

flow:
  - from: check_condition
    edges:
      - to: path_a
        condition: "lambda x: x['condition'] == True"
      - to: path_b
        condition: "lambda x: x['condition'] == False"

The lambda function takes the state dictionary as input and returns a boolean indicating whether the edge should be followed.
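
Conditions can also compare against values other than booleans. The sketch below branches on the answer from an input node; it assumes the selected option is stored as a string in `user_choice`, that `get_query` is defined elsewhere in the file, and that a conditional edge may point to END. The two conditions are written so that exactly one of them is true for any answer.

```yaml
nodes:
  - name: ask_continue
    type: input
    prompt: |
      Would you like to run another search?
    options:
      - "yes"
      - "no"
    output_model:
      user_choice: str
flow:
  - from: ask_continue
    edges:
      - to: get_query
        condition: "lambda x: x['user_choice'] == 'yes'"
      - to: END
        condition: "lambda x: x['user_choice'] != 'yes'"
```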

Variable Interpolation

You can include variables from the state in prompts using curly braces:

prompt: |
  Generate a response to the user's query: {search_query}
  
  Previous results:
  {previous_results}

Complete Examples

Example 1: PR Review Agent

Here's a complete example of an agent that reviews a pull request using both LLM and Tool nodes:

description: PR Review Agentic Flow
nodes:
  - name: list_prs
    type: llm
    system: |
      You are a GitHub CLI expert. Your task is to list all open pull requests in the current repository.
    prompt: |
      Use the 'gh pr list' command to list all open pull requests.

      Format the output as:
      123: Title of PR 1
      456: Title of PR 2
      789: Title of PR 3
    output_model:
      pr_list: list
    tools: true
    tools_selection:
      - shell_command

  - name: select_pr
    type: input
    prompt: |
      Please select a pull request from the list below by entering its number:
      {pr_list}
    output_model:
      selected_pr: str
    options: [pr_list]

  - name: get_pr_diff
    type: llm
    system: |
      You are a GitHub CLI expert. Your task is to use the 'gh' command to retrieve the diff for a specific pull request.
    prompt: |
      Use the 'gh pr diff' command to get the diff for PR number {selected_pr}.
      IMPORTANT: The tool will return the raw diff. Your final task for this step is to confirm its retrieval.
    output_model:
      retrieval_status: str
    tools: true
    tools_selection:
      - shell_command
    raw_tool_output:
      pr_diff: str

  - name: chunk_pr
    type: tool
    args:
      diff_content: {pr_diff}
    tools_selection:
      - chunk_pr_diff
    output_model:
      pr_chunks: list
      current_index: int

  - name: review_chunk
    type: llm
    system: |
      You are a code review assistant. Your task is to review a chunk of code and provide feedback.
    prompt: |
      Review the following chunk of code:
      {pr_chunks[current_index]}
      Provide your feedback on this chunk.
    output_model:
      chunk_review: str

  - name: collect_reviews
    type: llm
    prompt: |
      collected_reviews:
      {collected_reviews}

      Append the following review to the collected reviews:
      {chunk_review}
    output_model:
      collected_reviews: list

  - name: increment_index
    type: llm
    prompt: |
      Increment current_index: {current_index}. Output current_index + 1.
    output_model:
      current_index: int

  - name: show_reviews
    type: llm
    system: |
      You are a summarization assistant. Your task is to present the collected reviews to the user.
    prompt: |
      Here are the reviews for the pull request:
      {collected_reviews}
    output_model:
      final_summary: str
    user_message:
      - final_summary

flow:
  - from: START
    to: list_prs
  - from: list_prs
    to: select_pr
  - from: select_pr
    to: get_pr_diff
  - from: get_pr_diff
    to: chunk_pr
  - from: chunk_pr
    to: review_chunk
  - from: review_chunk
    to: collect_reviews
  - from: collect_reviews
    to: increment_index
  - from: increment_index
    edges:
      - to: review_chunk
        condition: "lambda x: x['current_index'] < len(x['pr_chunks'])"
      - to: show_reviews
        condition: "lambda x: x['current_index'] >= len(x['pr_chunks'])"
  - from: show_reviews
    to: END

Example 2: Parallel PR Review Agent

This example demonstrates how to use the parallelism feature to review multiple chunks of a PR simultaneously:

description: Parallel PR Review Agentic Flow
nodes:
  - name: list_prs
    type: llm
    system: |
      You are a GitHub CLI expert. Your task is to list all open pull requests in the current repository.
    prompt: |
      Use the 'gh pr list' command to list all open pull requests.

      Format the output as:
      123: Title of PR 1
      456: Title of PR 2
      789: Title of PR 3
    output_model:
      pr_list: list
    tools: true
    tools_selection:
      - shell_command

  - name: select_pr
    type: input
    prompt: |
      Please select a pull request from the list below by entering its number:
      {pr_list}
    output_model:
      selected_pr: str
    options: [pr_list]

  - name: get_pr_diff
    type: llm
    system: |
      You are a GitHub CLI expert. Your task is to use the 'gh' command to retrieve the diff for a specific pull request.
    prompt: |
      Use the 'gh pr diff' command to get the diff for PR number {selected_pr}.
      IMPORTANT: The tool will return the raw diff. Your final task for this step is to confirm its retrieval.
    output_model:
      retrieval_status: str
    tools: true
    tools_selection:
      - shell_command
    raw_tool_output:
      pr_diff: str

  - name: chunk_pr
    type: tool
    args:
      diff_content: {pr_diff}
    tools_selection:
      - chunk_pr_diff
    output_model:
      pr_chunks: list

  - name: review_chunks
    type: llm
    parallel:
      forEach: "{pr_chunks}"
      as: "chunk"
      maxConcurrency: 5
    output_action: append
    system: |
      You are a code review assistant. Your task is to review a chunk of code and provide feedback.
    prompt: |
      Review the following chunk of code:
      {chunk}
      Provide your feedback on this chunk.
    output_model:
      chunk_reviews: list

  - name: summarize_reviews
    type: llm
    system: |
      You are a summarization assistant. Your task is to present a summary of the code reviews to the user.
    prompt: |
      Here are the reviews for the pull request:
      {chunk_reviews}
      
      Summarize the key points from these reviews.
    output_model:
      final_summary: str
    user_message:
      - final_summary

flow:
  - from: START
    to: list_prs
  - from: list_prs
    to: select_pr
  - from: select_pr
    to: get_pr_diff
  - from: get_pr_diff
    to: chunk_pr
  - from: chunk_pr
    to: review_chunks
  - from: review_chunks
    to: summarize_reviews
  - from: summarize_reviews
    to: END

In this parallel version:

The chunk_pr node splits the PR diff into chunks.
The review_chunks node uses the parallel configuration to process all chunks concurrently, with a maximum of 5 concurrent operations.
The output_action: append ensures that all chunk reviews are collected into a list.
The summarize_reviews node then summarizes all the collected reviews.

This parallel approach can significantly speed up the review process for large PRs by processing multiple chunks simultaneously.

Best Practices

Use descriptive names: Give nodes clear, descriptive names that indicate their purpose
Keep prompts focused: Each node should have a specific purpose and a focused prompt
Use system messages: Set appropriate system messages for LLM nodes to guide the AI's behavior
Validate user input: Use input nodes with options to restrict user input to valid choices
Handle errors: Use conditional edges to handle potential errors in the flow
Use tools judiciously: Only enable tools that are necessary for the node's function
Document your YAML: Add comments to explain complex parts of the configuration
Test thoroughly: Test your agent with various inputs to ensure it behaves as expected
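
Two of the practices above, commenting the configuration and handling errors with conditional edges, might look like this in a flow. All node names and the `status` variable are hypothetical; the sketch assumes `fetch_data` declares `status: str` in its `output_model` and sets it to "ok" or "error".

```yaml
flow:
  - from: START
    to: fetch_data
  # fetch_data is assumed to write "ok" or "error" to the status variable
  - from: fetch_data
    edges:
      - to: process_data
        condition: "lambda x: x['status'] == 'ok'"
      # Any other status is treated as a failure
      - to: report_error
        condition: "lambda x: x['status'] != 'ok'"
  - from: process_data
    to: END
  - from: report_error
    to: END
```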

Validation

Astonish validates YAML configurations against a schema to ensure they are well-formed. You can use the validate_yaml_with_schema tool to validate your configurations:

- name: validate_config
  type: llm
  prompt: |
    Validate the following YAML configuration against the schema:
    
    Configuration:
    {yaml_content}
    
    Schema:
    {yaml_schema}
  output_model:
    validation_result: str
  tools: true
  tools_selection:
    - validate_yaml_with_schema

Next Steps

To learn more about YAML configuration in Astonish, check out:

Agentic Flows for an overview of how nodes are connected
Nodes for details on node types and configuration
Tools for information on using tools in your agents
Tutorials for examples of creating agents
