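Tools

Tools let an assistant use OpenAI-hosted tools such as Code Interpreter and Knowledge Retrieval, or call tools you built yourself through function calling. By drawing on internal and external tools, the AI can make up for its weaknesses and handle far more kinds of work. An AI that uses tools!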

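Code Interpreter

Code Interpreter is a tool that lets the Assistants API write and run Python code in a sandboxed environment. It can process files that contain data, and it can produce data files and images of graphs. It can also revise its code and try again when the code fails. (Self-correction was a fairly popular concept for a while.)

Enabling Code Interpreter

Pass code_interpreter in the tools parameter of the Assistant object. The model then runs Code Interpreter whenever it decides the user's message calls for it. You can steer this behavior in the instructions (for example: "write code to solve this problem").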

assistant = client.beta.assistants.create(
  instructions="You are a personal math tutor. When asked a math question, write and run code to answer the question.",
  model="gpt-4-1106-preview",
  tools=[{"type": "code_interpreter"}]
)

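This example creates a personal math tutor that is told to write and run code whenever it receives a math question.

Giving files to Code Interpreter

Code Interpreter can receive files and work with them. This is useful when you want to supply a large volume of data, or when you want users to upload files for analysis.

This time let's practice with real code. Create tool.py and write the following.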

from openai import OpenAI
import time

client = OpenAI(api_key="YOUR_API_KEY")

# "rb" opens the file in binary read mode
file = client.files.create(
  file=open("math.csv", "rb"),
  purpose="assistants"
)

assistant = client.beta.assistants.create(
  instructions="You are a personal math tutor. When asked a math question, write and run code to answer the question.",
  model="gpt-4-1106-preview",
  tools=[{"type": "code_interpreter"}],
  file_ids=[file.id]
)

print(assistant)

Then run it in the terminal.

python tool.py

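If the assistant was created successfully, copy the assistant id from the output and keep it somewhere. From now on we can retrieve the assistant by that id and reuse it.

Replace YOUR_API_KEY with your own key.
For the file, I made a simple csv file to try some arithmetic on: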

Number1	Number2	Addition	Subtraction	Multiplication	Division
1	6	7	-5	6	0.166667
2	7	9	-5	14	0.285714
3	8	11	-5	24	0.375
4	9	13	-5	36	0.444444
5	10	15	-5	50	0.5

Copy the table and paste it into Excel or a similar program, or download it from Google Drive (math.csv).
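
If you would rather generate the file than copy it, the table can be rebuilt from its first two columns. This is only a minimal sketch: the helper names build_rows and write_math_csv are made up for this example, and it assumes a comma-separated file written with Python's standard csv module is what you want.

```python
import csv

HEADER = ["Number1", "Number2", "Addition", "Subtraction", "Multiplication", "Division"]


def build_rows():
    """Rebuild the table above: Number1 runs 1..5 and Number2 runs 6..10."""
    rows = []
    for a, b in zip(range(1, 6), range(6, 11)):
        # Division is rounded to six decimal places, matching the table.
        rows.append([a, b, a + b, a - b, a * b, round(a / b, 6)])
    return rows


def write_math_csv(path="math.csv"):
    """Write the header and the rows to `path` as comma-separated values."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        writer.writerows(build_rows())


if __name__ == "__main__":
    write_math_csv()
```

Running the script once leaves a math.csv next to tool.py, ready for the upload step.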

Note that files can also be attached at the thread level, on an individual message.

thread = client.beta.threads.create(
  messages=[
    {
      "role": "user",
      "content": "I need to solve the equation `3x + 11 = 14`. Can you help me?",
      "file_ids": [file.id]
    }
  ]
)

A file can be up to 512 MB. Code Interpreter supports many file formats, including csv, pdf, and json. The OpenAI documentation has the full list of supported files.
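
Because of the size limit, it can be worth checking a file locally before uploading it. The following is a small sketch, not part of the API: fits_upload_limit is a hypothetical helper, and it assumes "512 MB" means 512 * 1024 * 1024 bytes.

```python
import os

# Assumption: the 512 MB limit is read as 512 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 512 * 1024 * 1024


def fits_upload_limit(path, max_bytes=MAX_UPLOAD_BYTES):
    """Return True if the file at `path` is no larger than `max_bytes`."""
    return os.path.getsize(path) <= max_bytes
```

You could call fits_upload_limit("math.csv") right before client.files.create and stop early when it returns False.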

Trying it out

Now let's use the assistant we created.

# The earlier creation code is commented out
# assistant = client.beta.assistants.create(
#     instructions="You are a personal math tutor. When asked a math question, write and run code to answer the question.",
#     model="gpt-4-1106-preview",
#     tools=[{"type": "code_interpreter"}],
#     file_ids=[file.id],
# )

# print(assistant)

# Replace YOUR_ASSISTANT_ID with the assistant id you copied earlier
assistant = client.beta.assistants.retrieve("YOUR_ASSISTANT_ID")

thread = client.beta.threads.create()

message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    # Prompt (Korean): "Tell me the sum of the first column of the attached csv file."
    content="첨부된 csv 파일의 첫번째 열의 합을 알려줘.",
)

run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,
)

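# Code Interpreter is slow, so check the status once every 5 seconds
while True:
    # Stop on any terminal status so a failed run does not loop forever
    if run.status in ("completed", "failed", "cancelled", "expired"):
        break
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)
    print("Running...")
    time.sleep(5)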

messages = client.beta.threads.messages.list(thread_id=thread.id)
print(messages)

As before, we create a thread, add a message to it, and then run the thread. We keep checking the status of the run, and once it is completed we read the response from the messages.
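
The polling step can also be pulled out into a reusable function with a time limit. This is a sketch under stated assumptions rather than anything provided by the SDK: wait_for_status and TERMINAL_STATUSES are names invented here, and the set of terminal statuses is an assumption about when a run stops changing.

```python
import time

# Assumption: these are the statuses after which a run no longer changes.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired"}


def wait_for_status(fetch_status, interval=5.0, timeout=300.0, sleep=time.sleep):
    """Call fetch_status() until it returns a terminal status or time runs out.

    fetch_status: zero-argument callable returning the current status string.
    Returns the final status; raises TimeoutError once `timeout` seconds of
    waiting have passed without reaching a terminal status.
    """
    waited = 0.0
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout:
            raise TimeoutError(f"still {status!r} after {waited:.0f} seconds")
        sleep(interval)
        waited += interval
```

In tool.py, fetch_status could be a lambda that calls client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id) and returns its status attribute. Taking the sleep function as a parameter keeps the helper easy to test without real waiting.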

Here is the output, tidied up into one instance per message.

# One instance per message (newest first)
message1 = ThreadMessage(
    id='msg_xG6VhzRHgJxHwc6cBoeIYcpS',
    assistant_id='asst_CEVHjX4kiDXKxl6QCmTHZQ8V',
    # English: "The sum of the first column is 15."
    content=[MessageContentText(text='첫 번째 열의 합계는 15입니다.')],
    created_at=1700375215,
    role='assistant',
    thread_id='thread_zmyzfjGZuFbb2qso8CxaLCQf'
)

message2 = ThreadMessage(
    id='msg_NlW4q6Q6GNu3z5fBOrHDe2gS',
    assistant_id='asst_CEVHjX4kiDXKxl6QCmTHZQ8V',
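    # English: "The CSV file loaded successfully and the structure of the data is visible. The first column is named 'Number1'; I will add up all of its values to get the sum."
    content=[MessageContentText(text="CSV 파일이 성공적으로 로드되었으며, 데이터의 구조를 확인할 수 있습니다. 첫 번째 열의 이름은 'Number1'이며, 이 열의 값을 모두 더해 합계를 계산하겠습니다.")],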
    created_at=1700375212,
    role='assistant',
    thread_id='thread_zmyzfjGZuFbb2qso8CxaLCQf'
)

message3 = ThreadMessage(
    id='msg_cKpKy8s7dIIopxPFC1ilRzYh',
    assistant_id='asst_CEVHjX4kiDXKxl6QCmTHZQ8V',
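    # English: "First, I will load the file to check the contents of the CSV you uploaded and calculate the sum of the first column."
    content=[MessageContentText(text='먼저 업로드하신 CSV 파일의 내용을 확인하고, 첫 번째 열의 합을 계산하기 위해 파일을 불러와 보겠습니다.')],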
    created_at=1700375206,
    role='assistant',
    thread_id='thread_zmyzfjGZuFbb2qso8CxaLCQf'
)

message4 = ThreadMessage(
    id='msg_3WdaZ5MNhpmVMuCOzCMBqZAC',
    assistant_id=None,
    # English: "Tell me the sum of the first column of the attached csv file."
    content=[MessageContentText(text='첨부된 csv 파일의 첫번째 열의 합을 알려줘.')],
    created_at=1700375205,
    role='user',
    thread_id='thread_zmyzfjGZuFbb2qso8CxaLCQf'
)

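You can see that it receives the question, makes a plan, runs Code Interpreter, and then presents the result. In chronological order, translated into English:

Tell me the sum of the first column of the attached csv file.
First, I will load the file to check the contents of the CSV you uploaded and calculate the sum of the first column.
The CSV file loaded successfully and the structure of the data is visible. The first column is named 'Number1'; I will add up all of its values to get the sum.
The sum of the first column is 15.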

That is the result of 1 + 2 + 3 + 4 + 5, so 15 is correct. Smart GPT!

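Reading files or images generated by Code Interpreter

Code Interpreter can produce image diagrams, CSV files, PDF files, and so on as its result. The possible output types are:

Images
Data files (for example, a csv containing data generated by the Assistant)

Results produced this way can be looked up or downloaded using the value of the file_id field.

Let's ask our assistant from before to produce the result as a csv file.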

첨부된 csv 파일의 첫번째 열의 합을 알려줘. 결과는 csv 파일로 출력해줘. ("Tell me the sum of the first column of the attached csv file. Output the result as a csv file.")

The following code is the output, tidied up to be easier to read.

# File path annotation
file_path_annotation = TextAnnotationFilePath(
    end_index=137,
    file_path=TextAnnotationFilePathFilePath(file_id='file-WUeQ8rOTtHfJOZQIkDCO7YZx'),
    start_index=96,
    text='sandbox:/mnt/data/sum_of_first_column.csv',
    type='file_path'
)

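# Message text. The Korean value reads: "The sum of the first column has been calculated and saved to a new CSV file.
# You can download the result file from the link below.", followed by a download link.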
message_text = Text(
    annotations=[file_path_annotation],
    value='첫 번째 열의 합이 계산되어 새 CSV 파일로 저장되었습니다. 아래 링크에서 결과 파일을 다운로드할 수 있습니다.\n\n[sum_of_first_column.csv 다운로드](sandbox:/mnt/data/sum_of_first_column.csv)'
)

# ThreadMessage object
thread_message = ThreadMessage(
    id='msg_bjYcK10aMQc07mHEh9mEBhLQ',
    assistant_id='asst_CEVHjX4kiDXKxl6QCmTHZQ8V',
    content=[MessageContentText(text=message_text, type='text')],
    created_at=1700379797,
    file_ids=['file-WUeQ8rOTtHfJOZQIkDCO7YZx'],
    metadata={},
    object='thread.message',
    role='assistant',
    run_id='run_kD8NN3rBnzstZCcNPKsPpbk6',
    thread_id='thread_ihc9wMtj4jpWfIMAf1Wu64II'
)

This is what the output looks like. Notice that file_ids contains the id of the generated file.

Now let's download the file with this id through the Files API. Comment out the code above for a moment and run the following.

image_data = client.files.content("YOUR_FILE_ID")
image_data_bytes = image_data.read()

with open("./my-math.csv", "wb") as file:
    file.write(image_data_bytes)

After this, you should find my-math.csv in the folder.

If Code Interpreter provided the result file as a link, you can find the details in the annotations of the message.

file_path_annotation = TextAnnotationFilePath(
    end_index=137,
    file_path=TextAnnotationFilePathFilePath(file_id='file-WUeQ8rOTtHfJOZQIkDCO7YZx'),
    start_index=96,
    text='sandbox:/mnt/data/sum_of_first_column.csv',
    type='file_path'
)
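
Once the file has been downloaded, the sandbox link in the message text can be swapped for the local path by using the annotation's text and file_id. The sketch below is illustrative only: FilePathAnnotation is a hypothetical stand-in that keeps just two fields of the annotation above, localize_links is a made-up helper, and the sample value is a shortened version of the real message text.

```python
from dataclasses import dataclass


@dataclass
class FilePathAnnotation:
    """Hypothetical stand-in holding two fields from the annotation above."""
    text: str      # the sandbox link as it appears in the message text
    file_id: str   # id of the generated file


def localize_links(value, annotations, local_names):
    """Replace each sandbox link in `value` with a local file name.

    local_names maps file_id -> the name the file was saved under locally.
    Annotations whose file_id is not in local_names are left untouched.
    """
    for annotation in annotations:
        local = local_names.get(annotation.file_id)
        if local is not None:
            value = value.replace(annotation.text, local)
    return value


annotation = FilePathAnnotation(
    text="sandbox:/mnt/data/sum_of_first_column.csv",
    file_id="file-WUeQ8rOTtHfJOZQIkDCO7YZx",
)
# Shortened sample text: only the link part of the message shown earlier.
value = "[sum_of_first_column.csv](sandbox:/mnt/data/sum_of_first_column.csv)"
print(localize_links(value, [annotation], {annotation.file_id: "./my-math.csv"}))
```

The link in the printed text now points at ./my-math.csv instead of the sandbox path, which only exists inside the Code Interpreter environment.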

Viewing Code Interpreter's input and output logs

You can see Code Interpreter's logs by listing the steps of the run.

Add the following code at the end and check the logs of the run.

run_steps = client.beta.threads.runs.steps.list(thread_id=thread.id, run_id=run.id)

print(run_steps)

The output, tidied up:

sync_cursor_page_run_step = {
    "data": [
        # ... earlier RunStep objects ...
        {
            "id": "step_3B5ONOk5kijxTMy1MyfMisSc",
            "assistant_id": "asst_CEVHjX4kiDXKxl6QCmTHZQ8V",
            # ... other RunStep attributes ...
            "step_details": {
                "tool_calls": [
                    {
                        "id": "call_bx6BusL2mvFaxAWmKV6WqmXB",
                        "code_interpreter": {
                            "input": "# Create a new dataframe with the sum\nsum_df = pd.DataFrame([first_column_sum], columns=['Sum of First Column'])\n\n# Define the path for the new CSV file\noutput_file_path = '/mnt/data/sum_of_first_column.csv'\n\n# Save the sum to a new CSV file\nsum_df.to_csv(output_file_path, index=False)\n\noutput_file_path",
                            "outputs": [
                                {
                                    "logs": "'/mnt/data/sum_of_first_column.csv'",
                                    "type": "logs"
                                }
                            ]
                        },
                        "type": "code_interpreter"
                    }
                ],
                "type": "tool_calls"
            },
            # ... other RunStep attributes ...
        },
        # ... remaining RunStep objects ...
    ],
    "object": "list",
    "first_id": "step_4brrrOz4ycy0uaIGMQe62Pva",
    "last_id": "step_a9Q0QYxj4SZN59Vk2uthUmsq",
    "has_more": False
}

Here, under the code_interpreter entry, you can see the input and the outputs.

input :

# Create a new dataframe with the sum
sum_df = pd.DataFrame([first_column_sum], columns=['Sum of First Column'])

# Define the path for the new CSV file
output_file_path = '/mnt/data/sum_of_first_column.csv'

# Save the sum to a new CSV file
sum_df.to_csv(output_file_path, index=False)

output_file_path

outputs :

"outputs": [
    {
        "logs": "'/mnt/data/sum_of_first_column.csv'",
        "type": "logs"
    }
]

Looking at what Code Interpreter ran, you can see that it loads the data into a DataFrame from Pandas, the data processing library, and processes it there.
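
Pulling the input and the logs out of that nested structure can be done with a short loop. The following is a minimal sketch that assumes the run steps have already been converted into plain dicts shaped like the tidied-up output above; code_interpreter_io is a hypothetical helper name, and the sample data is a cut-down version of that output.

```python
def code_interpreter_io(run_steps):
    """Collect (input, logs) pairs from run steps given as plain dicts."""
    pairs = []
    for step in run_steps.get("data", []):
        details = step.get("step_details", {})
        for call in details.get("tool_calls", []):
            if call.get("type") != "code_interpreter":
                continue
            body = call["code_interpreter"]
            # Keep only outputs of type "logs"; other output types are skipped.
            logs = [o["logs"] for o in body.get("outputs", []) if o.get("type") == "logs"]
            pairs.append((body["input"], logs))
    return pairs


sample = {
    "data": [
        {
            "step_details": {
                "type": "tool_calls",
                "tool_calls": [
                    {
                        "type": "code_interpreter",
                        "code_interpreter": {
                            "input": "output_file_path",
                            "outputs": [
                                {"type": "logs", "logs": "'/mnt/data/sum_of_first_column.csv'"}
                            ],
                        },
                    }
                ],
            }
        },
        # A step that only creates a message has no tool_calls.
        {"step_details": {"type": "message_creation"}},
    ]
}

for source, logs in code_interpreter_io(sample):
    print(source, logs)
```

Steps without tool calls are skipped, so the function returns one pair per Code Interpreter call in the run.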

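That covers Code Interpreter, the most representative of the tools. Next we will look at another tool, retrieval.

Other notes

The price is $0.03 per session, and a session lasts for one hour. The costs of Code Interpreter and the other tools OpenAI provides are listed on OpenAI's pricing page.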
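
As a quick worked example of that price, the cost scales linearly with the number of sessions. This sketch assumes you already know how many sessions were opened; code_interpreter_cost is a hypothetical helper, and Decimal is used only to avoid floating-point rounding in currency amounts.

```python
from decimal import Decimal

PRICE_PER_SESSION = Decimal("0.03")  # USD per session, the figure quoted above


def code_interpreter_cost(sessions):
    """Return the cost in USD of `sessions` Code Interpreter sessions."""
    if sessions < 0:
        raise ValueError("sessions must be non-negative")
    return PRICE_PER_SESSION * sessions


print(code_interpreter_cost(100))  # cost of 100 sessions in USD
```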

Full code

from openai import OpenAI
import time

client = OpenAI(api_key="YOUR_API_KEY")

# file = client.files.create(file=open("math.csv", "rb"), purpose="assistants")

# assistant = client.beta.assistants.create(
#     instructions="You are a personal math tutor. When asked a math question, write and run code to answer the question.",
#     model="gpt-4-1106-preview",
#     tools=[{"type": "code_interpreter"}],
#     file_ids=[file.id],
# )

# print(assistant)

assistant = client.beta.assistants.retrieve("YOUR_ASSISTANT_ID")

thread = client.beta.threads.create()

message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
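    # Prompt (Korean): "Tell me the sum of the first column of the attached csv file. Output the result as a csv file."
    content="첨부된 csv 파일의 첫번째 열의 합을 알려줘. 결과는 csv 파일로 출력해줘.",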
)

run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,
)

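print("Starting the run.")
while True:
    # Stop on any terminal status so a failed run does not loop forever
    if run.status in ("completed", "failed", "cancelled", "expired"):
        break
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)
    print("Running...")
    time.sleep(5)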

messages = client.beta.threads.messages.list(thread_id=thread.id)
print(messages)
print("\n\n")
# file_id = messages.data[0].file_ids[0]
# image_data = client.files.content(file_id)
# image_data_bytes = image_data.read()

# with open("./my-math.csv", "wb") as file:
#     file.write(image_data_bytes)

run_steps = client.beta.threads.runs.steps.list(thread_id=thread.id, run_id=run.id)
print(run_steps)
