![]() | #1 |
Supreme Warrior Overlord War Room Member Join Date: 2010 Location: Manila, Philippines
Posts: 1,927
Thanks: 778
Thanked 1,104 Times in 641 Posts
| Microsoft Said GPT-4 Allows Processing & Generation of Text, Images, Audio & Videos • "Microsoft explained that GPT-4 would be 'multimodal'. Holger Kenn, Director of Business Strategy at Microsoft Germany, explained that this would allow the company’s AI to translate a user’s text into images, music, and video." source: https://www.digitaltrends.com/comput...eek-ai-videos/ What is "Multimodal" in Deep Learning Context? • "multimodality" refers to the ability of a Deep Learning model to process different types of digital content as inputs, to generate outputs also as different digital media types, or both — Be it during model training and / or during production deployment. |
• Deep Learning & Machine Vision Engineer: ARIA Research (Sydney, AU) • Founder: Grayscale (Manila, PH) & SEO Campaign Manager: Kiteworks, Inc. (SF, US) Last edited on 14th Mar 2023 at 12:47 PM. | |
![]() | ![]() |
The Following 3 Users Say Thank You to Marx Vergel Melencio For This Useful Post: |
![]() | #2 |
Supreme Warrior Overlord War Room Member Join Date: 2010 Location: Manila, Philippines
Posts: 1,927
Thanks: 778
Thanked 1,104 Times in 641 Posts
| Microsoft's Cosmos1 MLLM (Multimodal Large Language Model) • PAPER (Technical Deep Learning Model Architecture, Training, Validation & Testing Details): https://arxiv.org/pdf/2302.14045.pdf QUOTE: The latest milestone in OpenAI’s effort in scaling up deep learning. GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits human-level performance on various professional and academic benchmarks. /QUOTE • Paper: https://cdn.openai.com/papers/gpt-4.pdf • Wahtch Developer Demo Livestream Here (04:00 March 15 Philippine Standard Time): https://youtube.com/live/outcGtbnMuQ?feature=share source: https://openai.com/research/gpt-4 |
• Deep Learning & Machine Vision Engineer: ARIA Research (Sydney, AU) • Founder: Grayscale (Manila, PH) & SEO Campaign Manager: Kiteworks, Inc. (SF, US) Last edited on 14th Mar 2023 at 01:14 PM. | |
![]() | ![]() |
The Following User Says Thank You to Marx Vergel Melencio For This Useful Post: |
![]() | #3 |
New Warrior Member Join Date: 2023
Posts: 2
Thanks: 0
Thanked 1 Time in 1 Post
|
Very much excited to use ChatGPT-4 multimodal model.
|
![]() | ![]() |
The Following User Says Thank You to Sameeksha Medewar For This Useful Post: |
![]() | #4 | |
VIP Warrior War Room Member Join Date: 2011 Location: New Jersey
Posts: 2,490
Thanks: 3,584
Thanked 2,274 Times in 1,504 Posts
Blog Entries: 6 | ![]() Re: 03-14 Update: GPT4's Here [ OpenAI's GPT-4 to Launch Next Week as "Multimodal" Network ]
Update - Now you can watch the Demo any time. Just give it a few seconds to load.
| |
Last edited on 15th Mar 2023 at 12:48 PM. | ||
![]() | ![]() |
The Following 2 Users Say Thank You to DWolfe For This Useful Post: |
![]() | #5 |
VIP Warrior War Room Member Join Date: 2011 Location: New Jersey
Posts: 2,490
Thanks: 3,584
Thanked 2,274 Times in 1,504 Posts
Blog Entries: 6 | ![]() Re: 03-14 Update: GPT4's Here [ OpenAI's GPT-4 to Launch Next Week as "Multimodal" Network ]
Marx, how did you like the Demonstration? Do you see any specific things that would help a new member with their Marketing?
|
![]() | ![]() |
The Following User Says Thank You to DWolfe For This Useful Post: |
![]() | #6 | |
Supreme Warrior Overlord War Room Member Join Date: 2010 Location: Manila, Philippines
Posts: 1,927
Thanks: 778
Thanked 1,104 Times in 641 Posts
| ![]() Re: 03-14 Update: GPT4's Here [ OpenAI's GPT-4 to Launch Next Week as "Multimodal" Network ]
@DWolfe,
But 3 things look promising (based solely on what was presented at their dev demo): 1) Improved Factual Correctness; 2) Less "Hallucinations"; and 3) Bigger Context Limits / Single API Call ... Notes • Third one can also take care of the other two, in case they're just hyping up "improved factual correctness" and "less 'hallucinations'". ![]() ** For example, in one API call — We can ask GPT4 to create a report after analyzing a relevant page (in any language) with suitable niche content depth from a trusted source we supply, such as say a whitepaper or case study from a research group at a reputable university, or a government site, or Google Trends for a certain niche topic and keyword, etc. — We can even supply it with multiple pages / content sources to analyze; and ** So lots of ideas are there already for programmatic content development, real time translation and assisted data analytics ... • Because previous 2 to 3 prompts may not consume expanded token limit for succeeding API call — This allows us devs to implement contextual memory, i.e. We can then do something programmatic for GPT4 to remember our previous input prompts and continue with the task while following succeeding prompts ... ** This can be useful for internal tool dev in content dev, real time translation and assisted data analytics, as well as for customer-facing tools like virtual agents for customer support, content moderation and management ... • And, I'm hoping they release their image processing features soon, which they presented in their dev demo livestream — For now, this is just with their partner (BeMyEyes, an app for the blind) ... ** This can be quite useful, i.e. "As an expert data analyst in the field of [enter field here], analyze this graph. Convert the data into a format that provides granular control over data points, such as a spreadsheet. Also provide recommendations and notes about the data, which can be helpful for [enter your objective here]." P.S. OpenAI's saying there are two larger token limit options for GPT4 API. First is 8K, which, considering average system and user prompts total to 300 to 400++ words, is around 1 to 1.2K++ words (based on my tests with multiple GPT Davinci v3.5 API calls just to hit this content depth). Second is 32K tokens (for a limited group in waiting list), which is around 4 to 4.8K+ words with the same average total words for system and user prompts ... | |
• Deep Learning & Machine Vision Engineer: ARIA Research (Sydney, AU) • Founder: Grayscale (Manila, PH) & SEO Campaign Manager: Kiteworks, Inc. (SF, US) Last edited on 17th Mar 2023 at 01:35 AM. Reason: added GPT4 API token limits, then corrected after reviewing notes (instead of relying on memory, which I did earlier) | ||
![]() | ![]() |
The Following User Says Thank You to Marx Vergel Melencio For This Useful Post: |
Bookmarks |
Tags |
gpt4, launch, multimodal, network, openai, week |
| |