How to Use the GPT-4o API

GPT-4o

GPT-4o is a multimodal model that can take text, audio, and images as input and produce them as output. OpenAI has made the model available through its API.

Features

GPT-4o generates tokens 2x faster than GPT-4 Turbo.

It is 50% cheaper than GPT-4 Turbo: input tokens cost $5 per million and output tokens cost $15 per million.

Model	Input	Output
gpt-4o	$5.00 / 1M tokens	$15.00 / 1M tokens

It has a 128k context window, and its training knowledge extends to October 2023 (gpt-4o-2024-05-13).
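The per-token prices above can be turned into a rough cost estimate. The following is a minimal sketch that hard-codes the two prices quoted in this post; the function name and the token counts in the usage line are illustrative assumptions, and actual billing may differ if prices change.

```python
# Prices quoted in this post for gpt-4o (USD per 1M tokens).
INPUT_USD_PER_MILLION = 5.00
OUTPUT_USD_PER_MILLION = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its input and output token counts."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


# Hypothetical request: 12,000 input tokens and 3,000 output tokens.
print(f"${estimate_cost_usd(12_000, 3_000):.3f}")  # $0.105
```

The token counts for a real request are reported in the usage field of the API response, so the same function can be applied after each call to track spend.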

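It supports up to 10 million tokens per minute, a rate limit 5x higher than GPT-4 Turbo's.

GPT-4o offers improved vision capabilities on most tasks.

GPT-4o uses a new tokenizer that handles non-English text more efficiently, which improves its capabilities in non-English languages.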
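To get a feel for what this rate limit means in practice, here is a small arithmetic sketch. It assumes the 10-million-token figure quoted above applies to your account (real limits depend on the account and may be lower), and it ignores any separate requests-per-minute limit. The function name and the 4,000-token request size are hypothetical.

```python
# Upper bound quoted in this post; an individual account may have a lower limit.
TOKENS_PER_MINUTE_LIMIT = 10_000_000


def max_requests_per_minute(
    tokens_per_request: int, tpm_limit: int = TOKENS_PER_MINUTE_LIMIT
) -> int:
    """Return how many requests of a given total size fit in one minute's token budget."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return tpm_limit // tokens_per_request


# Hypothetical workload: each request uses 4,000 tokens (input + output).
print(max_requests_per_minute(4_000))  # 2500

# The same workload under a limit one fifth as large (the GPT-4 Turbo ratio quoted above).
print(max_requests_per_minute(4_000, TOKENS_PER_MINUTE_LIMIT // 5))  # 500
```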

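Currently supported modalities

Video understanding: video is understood by sampling it at 2 to 4 frames per second. The video is converted into images and audio, which are then processed. See the GPT-4o cookbook for details.

Audio: not yet supported. It will be released first to trusted testers within the next few weeks.

Image generation: not currently supported, but support is planned.

You can try the model in the Playground: https://platform.openai.com/playground/chat?models=gpt-4o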
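The video-understanding approach listed above (sample a few frames per second and send them as images) can be sketched as follows. This is an illustration under stated assumptions, not the cookbook's code: the function names are hypothetical, and the frames are assumed to be already extracted and JPEG-encoded by a video library of your choice. The message layout uses the Chat Completions image input format (an image_url entry holding a base64 data URL).

```python
import base64


def sample_frame_indices(
    total_frames: int, source_fps: float, samples_per_second: float = 2.0
) -> list[int]:
    """Pick frame indices so that roughly `samples_per_second` frames are kept per second."""
    step = max(1, round(source_fps / samples_per_second))
    return list(range(0, total_frames, step))


def build_video_message(jpeg_frames: list[bytes], prompt: str) -> dict:
    """Build one user message holding a text prompt followed by the sampled frames."""
    content = [{"type": "text", "text": prompt}]
    for jpeg in jpeg_frames:
        encoded = base64.b64encode(jpeg).decode("utf-8")
        content.append(
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/jpeg;base64,{encoded}"},
            }
        )
    return {"role": "user", "content": content}


# A 3-second clip at 30 fps, sampled at 2 frames per second.
print(sample_frame_indices(total_frames=90, source_fps=30.0))  # [0, 15, 30, 45, 60, 75]
```

The dictionary returned by build_video_message can be placed in the messages list of client.chat.completions.create, in the same way as the text messages in app.py.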

Code example (Python)

Upgrade the openai package from the terminal (in a Jupyter notebook, prefix the command with %):

pip install --upgrade openai --quiet

app.py

from openai import OpenAI 
import os

## Set the API key and model name
MODEL="gpt-4o"
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY", "<your OpenAI API key if not set as an env var>"))

completion = client.chat.completions.create(
  model=MODEL,
  messages=[
    {"role": "system", "content": "You are a helpful assistant. Help me with my math homework!"}, # <-- This is the system message that provides context to the model
    {"role": "user", "content": "Hello! Could you solve 2+2?"}  # <-- This is the user message for which the model will generate a response
  ]
)

print("Assistant: " + completion.choices[0].message.content)

Nothing in particular changes for text; just set the model name to gpt-4o.

For examples of processing images, video, and audio (currently handled with whisper-1), see the cookbook: "Introduction to gpt-4o | OpenAI Cookbook".