Computer_Vision.md

A curated index of impactful AI tools and models that emphasizes technical merit and practical utility and prioritizes open source.

Effective AI use requires understanding capabilities, limitations, and bias mitigation strategies.


Computer Vision

Computer Vision (CV) frameworks implement neural architectures for visual data processing, analysis, and synthesis across image and video domains.

Caution

Use AI-generated images responsibly: Always disclose that they were created by AI. Be mindful of intellectual property rights.

Tip

Learn prompt engineering techniques for image generation models to enhance output quality and artistic control. Follow @nickfloats on 𝕏 for valuable insights on crafting prompts that achieve your desired visual outputs.

Image Editing

This section highlights tools and models that assist in modifying and enhancing existing images using AI-powered capabilities.

Image Editing Models

Note

The models are ranked by Elo score (higher is better) from the artificialanalysis.ai Text-to-Image Arena. Elo scores change with user votes, and this list will be updated regularly to reflect the latest rankings.

To provide a comprehensive overview of the generative image model landscape, only pre-trained versions of the listed models are included in this ranking.

Due to the continuous evolution and vast number of possible fine-tuned configurations, it is impractical to comprehensively list every variant here.

Organization	Model Name	Elo score
	GPT-4o	1094
	Flux.1 Kontext (pro)	1078
	Flux.1 Kontext (max)	1077
	Flux.1 Kontext (dev)	1008
	Bagel	928
	Step1X-Edit	864
HiDream	HiDream-E1-Full	857
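
As a rough guide to reading the Elo column above, the gap between two scores can be converted into an expected head-to-head win rate. The sketch below assumes the classic Elo logistic curve with a 400-point scale; the arena's exact rating method is not stated here, so treat the result as an approximation. The function name `expected_win_rate` is illustrative only.

```python
def expected_win_rate(elo_a: float, elo_b: float, scale: float = 400.0) -> float:
    """Probability that model A is preferred over model B under classic Elo.

    Assumption: standard logistic Elo curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** ((elo_b - elo_a) / scale))


if __name__ == "__main__":
    # Scores taken from the table above.
    gpt_4o = 1094
    kontext_dev = 1008
    rate = expected_win_rate(gpt_4o, kontext_dev)
    print(f"GPT-4o vs Flux.1 Kontext (dev): {rate:.1%} expected win rate")
```

Under this assumption, equal scores mean a 50% win rate, and every additional 400 points multiplies the odds of winning by ten.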

Cloud-based Image Editing Providers

This subsection lists platforms that offer image editing capabilities as a cloud service, typically accessible via web interfaces or APIs, without requiring local model deployment.

Tool	Description	Licence	Pricing
BRIA AI	An AI-powered model to automatically remove backgrounds from images.
Clarity AI	An AI image upscaler and enhancer; a free, open-source alternative to Magnific.
ImageFX	An AI-powered tool for applying various image effects and filters.
Lensa	An AI-powered mobile app for editing and enhancing photos, particularly for portrait editing.
Luminar Neo	An AI-powered photo editing software developed by Skylum.
Magnific AI	An AI-powered image upscaler and enhancer designed for professionals and enthusiasts in photography, graphic design, digital art, and illustration.
Pixlr	An AI-powered online photo editing tool.
Removebg	An online tool that allows users to automatically remove backgrounds from images.
ZMO AI	Comprehensive online platform offering AI-powered image editing tools. Features include background removal, object erasure, image enhancement, and creative modifications.

Image Generation

Explore tools and models designed to create novel images from textual descriptions or other inputs, leveraging the power of generative AI.

Image Generation Models

Note

The models are ranked according to their Elo scores (Higher score is better) from the artificialanalysis.ai Text-to-Image Arena and Imgsys.org Ranking. Please note that Elo scores are subject to change based on user votes and will be updated regularly to reflect the latest rankings.

To provide a comprehensive overview of the generative image model landscape, only pre-trained versions of the listed models are included in this ranking.

Due to the continuous evolution and vast number of possible fine-tuned configurations, it is impractical to comprehensively list every variant here.

Organization	Model Name	Elo score
	Seedream 3.0	1166
	GPT-4o	1165
	Imagen 4 Ultra	1150
	Imagen 4	1145
	FLUX.1-Kontext [Pro]	1127
	Recraft V3	1114
	Qwen-Image	1099
	Flux.1 Kontext	1098
	Imagen 3	1097
	Ideogram 3.0	1093
Reve AI	Reve Image 1.0	1090
	Flux1.1 Pro	1083
HiDream	HiDream-I1-Dev	1078
	Flux.1 Pro	1067
	MiniMax Image-01	1052
	Midjourney v6.1	1047
	Flux.1 Dev	1046
	Ideogram v2	1043
	Midjourney v7 Alpha	1039
	Midjourney v6	1038
	Ideogram v2 Turbo	1033
	Photon	1033
	Stable Diffusion 3.5 Large Turbo	1030
	Stable Diffusion 3.5 Large	1026
	Infinity 8B	1021
	Ideogram v1	1021
	Stable Diffusion 3 Large	1014
	Flux.1 schnell	1000
	Playground v3 (beta)	997
	Photon Flash	996
	Recraft 20B	976
	Playground v2.5	954
	Lumina Image v2	950
	Firefly Image 3	942
	DALLE 3 HD	941
	Stable Diffusion 3.5 medium	932
	DALLE 3	926
	Stable Diffusion 3 Medium	902
	Stable Diffusion 3 Large Turbo	897
	Stable Diffusion 1.6	885
	Stable Diffusion XL base 1.0	849
	DALLE 2	714
	Stable Diffusion 2.1	712
	Stable Diffusion 1.5	625
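
Because the scores move with user votes, a table like the one above can drift out of rank order between updates. The following is a minimal sketch for checking and restoring descending order, assuming the rows are available as tab-separated `Organization<TAB>Model Name<TAB>Elo score` lines without the header row. The helper names `parse_rows`, `out_of_order`, and `sort_by_elo` are hypothetical.

```python
from typing import List, Tuple

Row = Tuple[str, str, int]  # (organization, model name, Elo score)


def parse_rows(text: str) -> List[Row]:
    """Parse tab-separated 'Organization<TAB>Model Name<TAB>Elo score' lines."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        organization, model, score = line.split("\t")
        rows.append((organization.strip(), model.strip(), int(score)))
    return rows


def out_of_order(rows: List[Row]) -> List[Tuple[Row, Row]]:
    """Return adjacent pairs where a lower score is listed above a higher one."""
    return [(a, b) for a, b in zip(rows, rows[1:]) if a[2] < b[2]]


def sort_by_elo(rows: List[Row]) -> List[Row]:
    """Sort rows from highest to lowest score; ties keep their original order."""
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    # First three rows of the table above; the organization column is empty there.
    sample = "\n".join([
        "\tSeedream 3.0\t1166",
        "\tGPT-4o\t1165",
        "\tImagen 4 Ultra\t1150",
    ])
    rows = parse_rows(sample)
    print("Out-of-order pairs:", out_of_order(rows))
    for organization, model, score in sort_by_elo(rows):
        print(f"{score}\t{model}")
```

Python's `sorted` is stable, so models with identical scores keep the order in which they were listed.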

Cloud-based Image Generation Providers

This subsection lists platforms that offer image generation capabilities as a cloud service, typically accessible via web interfaces or APIs, without requiring local model deployment.

Tool	Description	Licence	Pricing
Craiyon	An AI-powered platform for generating artistic images and animations.
Dall-E	An AI model developed by OpenAI that generates images from textual descriptions.
Fal.ai	A generative media platform designed for developers building AI applications.
Firefly	A creative AI tool for generating images, animations, and other visual content.
Ideogram	An advanced text-to-image generator that creates high-quality images based on text prompts.
Krea	An advanced AI-powered platform designed for generating and enhancing visual content, including images and videos.
Lexica	An AI art platform that generates images from textual descriptions.
Leonardo	An AI platform for generating images from textual descriptions.
Midjourney	A world-famous AI platform that generates images and visual content based on user input.
Nightcafe	An AI art platform that generates images from textual descriptions using deep learning models.
Picasso	An AI-powered platform for generating images and animations, developed by NVIDIA.
Stable Diffusion	An open-source AI model for generating images from textual descriptions using diffusion-based generative models.

Local Image Generation Providers

Tip

Generate images locally - Deploy open-source image generation models on your hardware with our How to run Image Generation on your Machine tutorial.

Tool	Description	OS	Models
ComfyUI	A powerful, modular graphical user interface (GUI) for Stable Diffusion that gives users precise control over image generation workflows.	All	All Stable Diffusion Models + Flux.1
Diffusion Bee	A free, offline AI art generation tool designed specifically for macOS users.	macOS/iOS	All Stable Diffusion Models.
Draw Things	A free AI-assisted image generation app available for iOS devices, including iPhones and iPads.	macOS/iOS	All Stable Diffusion Models.
Fooocus	An open-source AI image generation tool designed to simplify the process of creating images using Stable Diffusion technology.	All	Stable Diffusion XL models.
Invoke	A leading creative engine for Stable Diffusion models.	All	All Stable Diffusion Models.
Stable Diffusion web UI by Automatic1111	A popular graphical user interface (GUI) for interacting with the Stable Diffusion models.	All	All Stable Diffusion Models.


Video Generation

Note

Video generation technology remains primarily concentrated among major AI research organizations, with models like OpenAI's Sora and Runway's Gen-3 leading development. Publicly available implementations are currently limited by the computational cost and proprietary nature of these systems.

This section will be updated as more open-source and accessible video generation models emerge.

Image-to-Video Models

Image-to-video models employ temporal diffusion algorithms to synthesize video sequences from static image inputs, generating coherent motion patterns and frame transitions.

Organization	Model Name	Licence	Pricing
	CogVideoX-5B-I2V
	img2vid-xt
	sv3d
	sv4d
Lightricks	LTX-Video
	Wan2.1-I2V-14B-720P

Text-to-Video Models

Text-to-video models convert natural language descriptions into video sequences through multi-modal generation frameworks, synthesizing temporal and spatial elements from textual inputs.

Note

The models are ranked according to their Elo scores (Higher score is better) from the artificialanalysis.ai Video Generation Arena. Please note that Elo scores are subject to change based on user votes and will be updated regularly to reflect the latest rankings.

Organization	Model Name	Elo score
	Seedance 1.0	1292
	Veo 3	1244
	Veo 2	1133
	Kling 2.0	1116
	Kling 1.5 (Pro)	1050
	Sora	1046
	T2V-01	1040
	Pika 2.0	1034
	Kling 1.6 (Pro)	1030
	Wan2.1-T2V-14B	1022
	T2V-01-Director	1020
	HunyuanVideo	1005
	Mochi-1	1000
	Gen-3 Alpha	987
	Kling 1.0	969
	Ray 1	969
	Ray 2	954
Haiper	Haiper 2.0	947
	Pika 1.5	943
	CogVideoX-5B	784
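
The note above says that scores shift with user votes. As an illustration of why, here is a minimal sketch of a single rating update, assuming the classic Elo rule with a 400-point scale. The K-factor of 32 is a placeholder, and the arena may compute its ratings differently. The function name `update_after_vote` is hypothetical.

```python
from typing import Tuple


def update_after_vote(winner: float, loser: float, k: float = 32.0) -> Tuple[float, float]:
    """Return the new (winner, loser) ratings after one head-to-head vote.

    Assumptions: classic Elo with a 400-point scale; K = 32 is a placeholder.
    """
    expected_winner = 1.0 / (1.0 + 10.0 ** ((loser - winner) / 400.0))
    delta = k * (1.0 - expected_winner)  # larger when the win was unexpected
    return winner + delta, loser - delta


if __name__ == "__main__":
    # Scores taken from the table above.
    veo_3, veo_2 = 1244.0, 1133.0
    print("Veo 3 preferred:", update_after_vote(veo_3, veo_2))
    new_veo_2, new_veo_3 = update_after_vote(veo_2, veo_3)
    print("Veo 2 preferred:", (new_veo_3, new_veo_2))
```

In this formulation the points gained by the winner equal the points lost by the loser, and an upset moves both ratings further than an expected result does.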

Video Generation Providers

Discover platforms that provide video generation services, enabling users to create video content from text or image prompts through cloud-based solutions.

Tool	Description	Licence	Pricing
Dream Machine	A groundbreaking text-to-video AI tool that enables users to generate high-quality, realistic video clips from simple text prompts in just minutes.
Elai	A video creation platform that enables users to produce videos by inputting text that is then narrated by AI-generated avatars.
Heygen	An innovative video platform that harnesses the power of generative AI to streamline the video creation process.
Higgsfield	A pioneering foundational model company that specializes in democratizing social media content creation through AI-powered video generation and editing tools.
Kling	An advanced video generation model developed by Kuaishou Technology, known for its capabilities in creating high-quality videos from text prompts.
Krea	An advanced AI-powered platform designed for generating and enhancing visual content, including images and videos.
Runway	An AI-powered platform for creatives to use machine learning models in their workflows.
Sora	An AI model developed by OpenAI for generating videos from textual descriptions.
Synthesia	A synthetic media generation AI tool to create AI-generated video content efficiently.
Veo	A generative video model developed by Google, capable of producing high-quality 1080p videos.
Vlogger	A method for text and audio-driven talking human video generation from a single input image of a person.
Wombo	An AI-powered mobile app for creating lip-syncing videos and other creative content.

3D Model Generation

Transform text descriptions and images into detailed 3D models using AI. These models enable rapid prototyping, asset creation, and visualization by converting natural language or visual inputs into three-dimensional objects.

Text/Image-to-3D Models

Organization	Model Name	Licence	Pricing
	Hunyuan3D-2.1
	InstantMesh
	Stable-zero123
	TripoSR
	stable-fast-3d
craftsman3d	CraftsMan-v1-5
Ashawkey	LGM
Jade Choghari	vfusion3d
Zhaoxi Chen	3DTopia-XL

