What is ControlNet?
ControlNet is a neural network that works alongside Stable Diffusion models, enabling you to transfer compositions or human poses from a reference image into a newly generated image.
With ControlNet, you can control the placement of subjects, their appearance, and even their poses. Unique images can be created by creatively combining ControlNet with the Guide Image Modules.
Setting Up ControlNet
To add a Guide Image, drag and drop an image or video from your computer or from within Sogni. Alternatively, use the folder button to open Finder and select your desired image.
Step One
Assign a reference image or video
Video preprocessing can be time-consuming. You can reduce the processing time by adjusting the animation duration and frame-rate settings to limit the number of imported frames.
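As a rough sketch of how those two settings interact (illustrative only; the exact count Sogni imports may differ), the number of frames to preprocess is approximately duration times frame rate, so halving either setting halves the preprocessing work:

```python
def frames_to_preprocess(duration_seconds: float, fps: float) -> int:
    """Approximate number of video frames a given duration and
    frame rate would produce for preprocessing (hypothetical helper)."""
    return int(duration_seconds * fps)

# A 10-second clip at 30 fps yields twice the frames of the same clip at 15 fps:
full = frames_to_preprocess(10, 30)     # 300 frames
reduced = frames_to_preprocess(10, 15)  # 150 frames
```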
Step Two
Adjust the image using a Preprocessor
There are six preprocessors to choose from, represented by the buttons to the left of the Reference Image:
Face Capture: Locates all faces in the input image, analyzes each face to detect facial features, and generates a facial-landmarks map for use with the OpenPose CN-model. This lets you transfer facial expressions from one image to another. The Face Capture and Pose Capture preprocessors can be used simultaneously.
Pose Capture: Locates human subjects in the input image, analyzes each one to capture their pose, and generates a 'bones-and-joints' style map for use with the OpenPose CN-model. This lets you control the pose of subjects in your generated images.
Sketch: Creates a sketch-style image from the reference image with the help of an AI model.
Depth: Utilizes the MiDaS depth model to generate a depth map from the reference image.
Segmentation: Utilizes the IS-Net and U2Net segmentation models to generate a segmentation map from the reference image based on its distinct areas and subjects. When you select the Segmentation preprocessor, you can also choose to mask only subjects or only backgrounds. (IS-Net is used for single images, while U2Net is applied to video frames.)
Invert: Inverts the colors of the active image (original, depth map, or sketch).
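The inversion step is simple to sketch in code. Assuming an 8-bit image stored as a NumPy array (an illustration, not Sogni's actual implementation), inverting turns white-on-black outlines into black-on-white and vice versa:

```python
import numpy as np

def invert_image(img: np.ndarray) -> np.ndarray:
    """Invert an 8-bit image: each pixel value v becomes 255 - v."""
    return 255 - img

# A sketch or depth map is inverted the same way as an original image:
sketch = np.array([[0, 255], [128, 64]], dtype=np.uint8)
inverted = invert_image(sketch)  # [[255, 0], [127, 191]]
```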
Step Three
Select a ControlNet model
Sogni offers 14 different ControlNet models, each serving a different purpose and producing different results:
Canny: Works with the outlines of an image. Best when the reference image is an outline-style image.
Depth: Reproduces depth relationships from the reference image. Works well with original images and depth maps.
InPaint: Uses masks to define and modify specific areas. Use reference images with clearly transparent areas for the InPaint model to fill. InPaint can also zoom out an image and fill the resulting empty areas: add the image to ControlNet, activate 'Camera: zoom, pan, roll', zoom out to your desired level, select the InPaint model, and click the "Imagine" button.
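The zoom-out workflow above amounts to placing the original image on a larger, transparent canvas whose empty rim the model then fills. A minimal sketch with NumPy RGBA arrays (hypothetical helper, not Sogni's implementation):

```python
import numpy as np

def zoom_out_canvas(rgba: np.ndarray, border: int) -> np.ndarray:
    """Pad an RGBA image with fully transparent pixels on every side,
    mimicking a camera zoom-out; the transparent rim is the area an
    inpainting model would be asked to fill."""
    h, w, c = rgba.shape
    canvas = np.zeros((h + 2 * border, w + 2 * border, c), dtype=rgba.dtype)
    canvas[border:border + h, border:border + w] = rgba
    return canvas

img = np.full((64, 64, 4), 255, dtype=np.uint8)  # opaque white square
padded = zoom_out_canvas(img, 32)                # 128x128 with transparent rim
```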
InstructPix2Pix: Capable of making direct changes to the reference image from text instructions. Use the prompt and style fields to instruct ControlNet to change something in the image. You can type things like "Make them look like robots", "Add boats on the water", "Change sunglasses for ski goggles", "Make it look like winter", etc.
LineArt: Finds and reuses fine outlines.
LineArt Anime: LineArt optimized for anime-style images.
MLSD (Mobile Line Segment Detection): A straight-line detector. It is useful for extracting outlines with straight edges, such as interior designs, buildings, street scenes, picture frames, and paper edges.
Normal BAE: Reproduces depth relationships using surface-normal maps. The BAE normal map tends to render details in both background and foreground, whereas the Depth model may ignore background objects.
OpenPose: A fast human keypoint-detection model that extracts human poses, such as the positions of hands, legs, and head. Use it in conjunction with the Face Capture and Pose Capture preprocessors.
Scribble: Generates images from scribbles or freehand drawings. Works great with the Sketch preprocessor.
Segmentation: Finds and reuses distinct areas and subjects. Works well with the Segmentation and Depth preprocessors.
Shuffle: Finds and reorders the major elements of an image.
SoftEdge: In contrast with the LineArt models, SoftEdge finds and reuses soft edges to generate the new image. Works great without preprocessing.
Tile: Used for adding details to an image.
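To make the outline-style input that the edge-based models consume more concrete, here is a crude gradient-magnitude edge detector in NumPy. It is a stand-in for illustration only, not the detector Sogni or ControlNet actually uses; real pipelines use proper Canny edge detection:

```python
import numpy as np

def edge_map(gray: np.ndarray, threshold: float = 32.0) -> np.ndarray:
    """Produce a white-on-black outline image from a grayscale array
    using finite-difference gradients (a crude stand-in for Canny)."""
    g = gray.astype(float)
    gx = np.zeros_like(g)
    gy = np.zeros_like(g)
    gx[:, 1:] = np.diff(g, axis=1)  # horizontal intensity change
    gy[1:, :] = np.diff(g, axis=0)  # vertical intensity change
    magnitude = np.hypot(gx, gy)
    return np.where(magnitude > threshold, 255, 0).astype(np.uint8)

# A hard vertical boundary becomes a one-pixel-wide white outline:
img = np.zeros((4, 4), dtype=np.uint8)
img[:, 2:] = 255
outline = edge_map(img)  # white only along column 2
```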