OpenAI is set to release its latest model, GPT-4, in mid-March 2023.
The model is multimodal, meaning it can process various forms of input, including text, images, video, and sound. GPT-3 and GPT-3.5 operated in a single modality, text, which makes the move to multimodal a significant development. Microsoft Germany CTO Andreas Braun confirmed that GPT-4 would be available in the second week of March and emphasized that it would be able to work across multiple modalities.
Multimodal AI allows for a more comprehensive understanding of input and enables the model to provide more nuanced responses. Microsoft's Kosmos-1 model, released earlier in March, is multimodal and integrates text and images; GPT-4 goes a step further by adding video and sound. The model is also said to work across languages, so it can receive a question in one language and answer in another.
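To make that concrete, here is a minimal sketch of what a mixed text-and-image request to a multimodal model could look like. It is modeled on the message format the openai Python SDK later shipped for vision-capable models; the model name, the image URL, and the availability of this interface for GPT-4 at launch are assumptions, since no GPT-4 API had been published at the time of the announcement.

```python
# Illustrative sketch only -- assumes an OpenAI-style chat API that accepts
# mixed text and image content. The model name and image URL are placeholders,
# not a confirmed GPT-4 interface.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4",  # hypothetical multimodal-capable model name
    messages=[
        {
            "role": "user",
            "content": [
                # A question asked in German but answered in English --
                # the cross-language behavior described in the announcement.
                {"type": "text",
                 "text": "Was ist auf diesem Bild zu sehen? Please answer in English."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```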
One notable aspect of GPT-4 is grounding: tying the model's answers to facts so that they are more reliable. Microsoft is working on "confidence metrics" to support this, though it has not described how they work; a rough sketch of the general idea appears below. While there is no announcement yet on where GPT-4 will be used, Azure-OpenAI was specifically mentioned.
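Since neither Microsoft nor OpenAI has published how these confidence metrics are computed, the following is purely an illustration of the general pattern: retrieve supporting documents, then score how well the model's answer is backed by them. Every name and the scoring rule here are hypothetical; real systems would use entailment models or citation matching rather than word overlap.

```python
# Purely illustrative sketch of "grounding with confidence metrics".
# How Microsoft's actual confidence metrics work has not been disclosed;
# all names and the scoring heuristic below are hypothetical.
from dataclasses import dataclass


@dataclass
class GroundedAnswer:
    text: str
    confidence: float   # 0.0 (unsupported) to 1.0 (fully supported)
    sources: list[str]


def ground_answer(answer: str, documents: list[str]) -> GroundedAnswer:
    """Score an answer by how many of its claims appear in retrieved documents.

    Word overlap stands in for the far more sophisticated checks
    (entailment, citation matching) a production system would use.
    """
    claims = [s.strip() for s in answer.split(".") if s.strip()]
    supported = []
    for claim in claims:
        words = set(claim.lower().split())
        # A claim counts as supported if most of its words occur in a document.
        if any(len(words & set(doc.lower().split())) >= 0.6 * len(words)
               for doc in documents):
            supported.append(claim)
    confidence = len(supported) / len(claims) if claims else 0.0
    return GroundedAnswer(answer, confidence, documents)
```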
Google, on the other hand, has been struggling to keep pace with Microsoft: its AI technology is far less visible to consumers. This announcement further underscores Google's lack of leadership in consumer-facing AI.
In summary, GPT-4 is set to reshape the field of AI with its multimodal capabilities, allowing it to process a range of input types, including text, images, video, and sound. Its ability to work across languages and ground its answers in facts is intended to make it a more reliable source of information. Microsoft's rollout of AI technology has captured consumers' attention and reinforced the perception that Google is lagging behind.