Clearly, we need some extra processing steps to convert the JSON-formatted string in the response into a Python dictionary.
def info(response):
    txt = json.loads(response.json())['choices'][0]['message']['content']
    data = json.loads(txt.replace('```json\n', "").replace('\n```', ""))
    return data

info(response)
/tmp/ipykernel_17475/2561472427.py:3: PydanticDeprecatedSince20: The `json` method is deprecated; use `model_dump_json` instead. Deprecated in Pydantic V2.0 to be removed in V3.0. See Pydantic V2 Migration Guide at https://errors.pydantic.dev/2.10/migration/
txt = json.loads(response.json())['choices'][0]['message']['content']
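The chained `replace` calls above assume the reply is always wrapped in a `json` code fence with exactly that spacing. A slightly more defensive variant is sketched below. It is only an illustration: `parse_fenced_json` and `FENCE` are hypothetical names that are not part of llmcam, and the helper works on the message content string rather than on the response object. The deprecation warning shown above also suggests calling `model_dump_json` in place of `json` when serialising the response.

```python
import json
import re

# The fence marker is built programmatically so that this example does not
# contain a literal fence of its own.
FENCE = "`" * 3

# Optional opening fence (with or without a "json" tag), payload, closing fence.
_FENCED = re.compile(
    r"^\s*" + re.escape(FENCE) + r"(?:json)?\s*\n(.*?)\n?\s*" + re.escape(FENCE) + r"\s*$",
    re.DOTALL,
)

def parse_fenced_json(text: str) -> dict:
    """Parse a JSON object that may or may not be wrapped in a code fence."""
    match = _FENCED.match(text)
    payload = match.group(1) if match else text
    data = json.loads(payload)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data
```

With this helper, a reply without any fence is parsed directly, and a malformed payload still raises `json.JSONDecodeError` instead of being silently altered.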
Instead of transforming the response manually, we can try GPT structured output, which lets us supply a guideline and some field suggestions for the outputs. In our application, we have tested images from city cameras and images from satellites, and the two differ vastly in content. Hence, it may be more relevant to give GPT a separate output format and set of field suggestions for each type of image.
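One way to picture per-image-type output formats is sketched below. It is a minimal sketch and not the implementation used in llmcam: the field names are hypothetical suggestions, `build_response_format` and `_object_schema` are illustrative helpers, and the returned dictionary assumes the `json_schema` form of the `response_format` parameter used by OpenAI structured outputs.

```python
# Hypothetical field suggestions for each type of image (JSON Schema fragments).
CITY_CAMERA_FIELDS = {
    "timestamp": {"type": "string"},
    "location": {"type": "string"},
    "number_of_buildings": {"type": "integer"},
    "water_bodies_visible": {"type": "boolean"},
    "time_of_day": {"type": "string"},
}

SATELLITE_FIELDS = {
    "timestamp": {"type": "string"},
    "location": {"type": "string"},
    "latitude": {"type": "number"},
    "longitude": {"type": "number"},
    "cloud_cover_percentage": {"type": "number"},
}

def _object_schema(properties: dict) -> dict:
    """Wrap field definitions in an object schema with every field required."""
    return {
        "type": "object",
        "properties": properties,
        "required": list(properties),
        "additionalProperties": False,
    }

def build_response_format(image_type: str) -> dict:
    """Return a response_format payload for the given image type."""
    fields = {"city_camera": CITY_CAMERA_FIELDS, "satellite": SATELLITE_FIELDS}
    if image_type not in fields:
        raise ValueError(f"unknown image type: {image_type!r}")
    return {
        "type": "json_schema",
        "json_schema": {
            "name": f"{image_type}_info",
            "strict": True,
            "schema": _object_schema(fields[image_type]),
        },
    }
```

Under these assumptions, the result of `build_response_format("satellite")` would be passed as the `response_format` argument of a chat completion request, so the reply arrives as plain JSON that `json.loads` can parse without stripping code fences.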
This section tests the integration with our current GPT framework. This function can be used in combination with the ytlive module:
from llmcam.core.fc import *
from llmcam.core.fn_to_schema import function_schema
from llmcam.vision.ytlive import *

tools = [
    function_schema(capture_youtube_live_frame, "Youtube Live Capture"),
    function_schema(select_youtube_live_url, "Youtube Live URL"),
    function_schema(ask_gpt4v_about_image_file, "GPT4 Vision"),
]
messages = form_msgs([
    ("system", "You are a helpful system administrator. Use the supplied tools to assist the user."),
    ("user", "Hi, can you capture a YouTube Live? Use the default link.")
])
complete(messages, tools=tools)
print_msgs(messages)

[youtube] Extracting URL: https://www.youtube.com/watch?v=LMZQ7eFhm58
[youtube] LMZQ7eFhm58: Downloading webpage
[youtube] LMZQ7eFhm58: Downloading ios player API JSON
[youtube] LMZQ7eFhm58: Downloading tv player API JSON
[youtube] LMZQ7eFhm58: Downloading m3u8 information
CPL CREME tele
cap_2025.01.16_13:05:52_unclear.jpg
>> System:
You are a helpful system administrator. Use the supplied tools to assist the user.
>> User:
Hi, can you capture a YouTube Live? Use the default link.
>> Assistant:
I have captured an image from the default YouTube Live stream. You can find it stored at the
following path: `/home/nghivo/tinyMLaaS/llmcam/data/cap_2025.01.16_13:05:52_unclear.jpg`.

# Continue the conversation and ask about the image file
messages.append(form_msg("user", "Can you extract information about this image?"))
complete(messages, tools=tools)
print_msgs(messages)

>> System:
You are a helpful system administrator. Use the supplied tools to assist the user.
>> User:
Hi, can you capture a YouTube Live? Use the default link.
>> Assistant:
I have captured an image from the default YouTube Live stream. You can find it stored at the
following path: `/home/nghivo/tinyMLaaS/llmcam/data/cap_2025.01.16_13:05:52_unclear.jpg`.
>> User:
Can you extract information about this image?
>> Assistant:
Here's the information extracted from the captured image: - **Timestamp**: January 16, 2025, 12:56
PM - **Location**: Valkosaari - **Image Dimensions**: 1280x720 - **Buildings**: There are 5
buildings visible in the image. - **Buildings Height Range**: 2-4 stories - **Water Bodies
Visible**: Yes - **Type of Water Bodies**: Lake - **Sky Visible**: Yes - **Sky Light Conditions**:
Clear - **Visibility**: Clear - **Time of Day**: Afternoon - **Artificial Lighting**: Low Feel free
to ask if you need more details or further assistance!
Another scenario uses a live satellite stream:
messages = form_msgs([
    ("system", "You are a helpful system administrator. Use the supplied tools to assist the user."),
    ("user", "Capture an image from satellite and extract information about it.")
])
complete(messages, tools=tools)
print_msgs(messages)

[youtube] Extracting URL: https://www.youtube.com/watch?v=xRPjKQtRXR8
[youtube] xRPjKQtRXR8: Downloading webpage
[youtube] xRPjKQtRXR8: Downloading ios player API JSON
[youtube] xRPjKQtRXR8: Downloading tv player API JSON
[youtube] xRPjKQtRXR8: Downloading m3u8 information
>> System:
You are a helpful system administrator. Use the supplied tools to assist the user.
>> User:
Capture an image from satellite and extract information about it.
>> Assistant:
The captured satellite image from the YouTube Live stream provides the following information: -
**Timestamp:** 2025-01-16 at 11:06:44 UTC - **Location:** South Atlantic Ocean - **Latitude:**
-45.07 - **Longitude:** 3.68 - **Image Dimensions:** 1280x720 pixels - **Satellite Name:**
International Space Station - **Sensor Type:** Optical - **Cloud Cover Percentage:** 80% - **Land
Cover Types Detected:** Ocean, Cloud - **Waterbodies Detected:** Yes, specifically Ocean - **Urban
Areas Detected:** No - **Lightning Detected:** No - **Thermal Anomalies Detected:** No - **Nighttime
Imaging:** No - **Minutes to Sunset:** 28 minutes - **Satellite Speed:** 7.6507 km/s - **Satellite
Altitude:** 433.2 km This image mainly shows the ocean with significant cloud cover and no urban
areas or thermal anomalies.