Creating Magic on Canvas: Essential Photoshop Brushes for Drawing


Importance of Photoshop Brushes in Drawing

In the realm of digital art, Photoshop brushes are the unsung heroes that bring illustrations to life. Whether you’re sketching, painting, or adding intricate details, the right brush can be a game-changer in achieving the desired effect.

Key Reasons to Value Photoshop Brushes:

  • Versatility: They cater to various styles, from realistic portraits to abstract compositions.
  • Control & Precision: Fine-tune your artwork with brushes that mimic traditional mediums like pencil, ink, or paint.
  • Time Efficiency: Quick adjustments and effects expedite the creative process without compromising quality.

Overview of the Essential Brushes Covered

This article will delve into four essential categories of brushes that every aspiring artist should have in their arsenal:

  • Pencil Brushes: Perfect for sketching and detailed drawing.
  • Paint Brushes: Ideal for vibrant and rich paintings.
  • Texture Brushes: For adding depth and incredible details.
  • Special Effects Brushes: Tools for creating stunning lighting and atmosphere.

Get ready to transform your digital creations with these indispensable tools!

Pencil Brushes

Characteristics and Usage

When it comes to Photoshop brushes for drawing, pencil brushes are often the first choice for artists seeking to replicate the traditional pencil look. These brushes are designed to mimic the texture and feel of graphite on paper.

Characteristics of Pencil Brushes:

  • Pressure Sensitivity: They react dynamically to the pressure applied, allowing for varied line thickness.
  • Textured Variations: Many offer a range of textures, from smooth to rough, to reflect different pencil types.
  • Opacity Control: Users can easily adjust opacity to create subtle gradations.

Ideal for everything from quick sketches to detailed illustrations, pencil brushes are essential tools for digital artists.

Tips for Realistic Pencil Drawing

To elevate your digital pencil drawing, consider these practical tips:

  • Layering: Use multiple layers to build depth and dimension gradually.
  • Vary Brush Sizes: Combine different brush sizes for a mix of fine details and broader strokes.
  • Blend with Smudge Tool: Use the smudge tool to smooth transitions and mimic traditional pencil blending.

With these techniques, you’ll be on your way to creating stunning, realistic pencil drawings in Photoshop!

Paint Brushes

Different Types of Paint Brushes

Once you’ve immersed yourself in the world of pencil brushes, it’s time to explore the vibrant universe of paint brushes available in Photoshop. These brushes are designed to emulate a range of traditional painting styles.

Types of Paint Brushes:

  • Round Brushes: Perfect for detail work and fine lines.
  • Flat Brushes: Excellent for broad strokes, filling in large areas, or creating sharp edges.
  • Fan Brushes: Great for textures like foliage or creating unique patterns.
  • Dry Brush: Mimics the look of a dry paint application, adding grit and depth to your artwork.

Each type has its own unique effect, allowing artists to express their creativity in various ways.

Techniques for Painting in Photoshop

To master painting in Photoshop, consider adopting these effective techniques:

  • Layering: Build colors gradually by working in multiple layers for depth and richness.
  • Blending: Use a soft round brush or smudge tool to seamlessly blend colors.
  • Wet Brush Techniques: Explore brushes that simulate wet paint effects for fluid transitions.

With these skills in your toolkit, you’ll bring your digital canvas to life, one brushstroke at a time!

Texture Brushes

Adding Depth and Detail

As we venture further into the world of Photoshop brushes for digital art, texture brushes stand out as vital tools for artists keen on enhancing their work with depth and detail. These brushes are specially crafted to introduce intricate layers that mimic the varied surfaces of traditional art.

Benefits of Using Texture Brushes:

  • Realism: They can add a tactile quality to illustrations, making them feel more authentic.
  • Complexity: Introduce visual interest by layering textures, transforming flat designs into captivating pieces.
  • Flexibility: Adaptable for all styles, from hyper-realistic portraits to abstract designs.

Texture brushes allow artists to breathe life into their creations by cleverly manipulating the feel and look of surfaces.

Creating Unique Effects with Texture Brushes

To really make your artwork pop, try these innovative techniques with texture brushes:

  • Overlay Techniques: Use texture brushes as overlays to create backgrounds or subtle patterns.
  • Color Variance: Experiment with different colors on textures to define lighting and shadow effects.
  • Blend with Original Shapes: Combine textures with existing shapes for a cohesive design that feels organic.

By incorporating these strategies, you’ll unlock a treasure trove of unique effects that elevate your digital art to new heights!

Special Effects Brushes

Light and Shading Brushes

Transitioning from texture to special effects brushes, we dive into the realm of light and shading brushes that have the power to transform your artwork instantly. These brushes are designed to mimic various sources of light, creating stunning highlights and deep shadows.

Characteristics of Light and Shading Brushes:

  • Dynamic Opacity: Adjusts with pressure, allowing for soft transitions from light to dark.
  • Unique Shapes: Custom shapes that reflect different light effects, such as glows, streaks, or diffused light.
  • Blendability: Easily blended with other colors to achieve a natural look.

These brushes can define your art, making it come alive with emotion and depth.

Creating Atmosphere and Mood

To establish mood and atmosphere in your compositions, consider these effective techniques:

  • Layering Light: Use multiple layers of light brushes to simulate different times of day and create drama.
  • Varying Color Palettes: Integrate complementary colors to produce mood shifts, like warmth for a cozy scene or cooler tones for a melancholic vibe.
  • Highlighting Focal Points: Draw attention by using high-contrast light around the central subject.

By mastering these techniques, you will convey powerful stories and emotions through your digital art, captivating your audience with every brushstroke!

Conclusion

Summary of Essential Photoshop Brushes

As we’ve explored throughout this article, Photoshop brushes are indispensable tools for any digital artist. From sketching and painting to texturing and applying special effects, the right brushes elevate your artwork from ordinary to extraordinary.

The Essential Brushes Covered:

  • Pencil Brushes: Ideal for sketching and fine detailing.
  • Paint Brushes: Perfect for vibrant, fluid painting techniques.
  • Texture Brushes: Great for adding depth and intricate details.
  • Special Effects Brushes: Essential for creating dynamic lighting and mood.

Each brush type plays a vital role in enhancing your overall artistic expression.

Free Photoshop Brushes To Download

  • Free Pencil Brushes for Photoshop
  • Free Marker Pen Brushes
  • Graphite Pencil Photoshop Brushes
  • Photoshop Illustration Brushes
  • Charcoal Pencil Brushes
  • Real Marker Brushes
  • Pencil Brushes
  • Sketch Brushes
  • Pencil Brushes for Photoshop
  • Free Paint Stroke Brushes
  • Essential Illustration Brushes
  • Pencil Brushes
  • Free Textures Brushes
  • Free Subtle Texture Brushes
  • Digital Paint Photoshop Brushes
  • Photoshop Paint Brushes
  • Fiber Brushes
  • Canvas Brushes
  • Digital Painting Photoshop Brush
  • Pencil Scribbles Photoshop Brushes
  • Painting Artistic Brushes
  • Dry Markers
  • Paint Stroke Brushes
  • Grunge Texture Brushes For Photoshop
  • Jess’s Acrylic Texture Brushes
  • Black and White HD Pencil Brushes
  • High Res Dry Brush Stroke Photoshop Brushes
  • Noise Texture Photoshop Brushes
  • Inky Photoshop Brushes
  • Ink Brushes for Photoshop


Integrating Image-To-Text And Text-To-Speech Models (Part 1)


Audio descriptions involve narrating contextual visual information in images or videos, improving user experiences, especially for those who rely on audio cues.

At the core of audio description technology are two crucial components: the description and the audio. The description involves understanding and interpreting the visual content of an image or video, which includes details such as actions, settings, expressions, and any other relevant visual information. Meanwhile, the audio component converts these descriptions into spoken words that are clear, coherent, and natural-sounding.

So, here’s something we can do: build an app that generates and announces audio descriptions. The app can integrate a pre-trained vision-language model to analyze image inputs, extract relevant information, and generate accurate descriptions. These descriptions are then converted into speech using text-to-speech technology, providing a seamless and engaging audio experience.

By the end of this tutorial, you will gain a solid grasp of the components that are used to build audio description tools. We’ll spend time discussing what VLM and TTS models are, as well as many examples of them and tooling for integrating them into your work.

When we finish, you will be ready to follow along with a second tutorial in which we level up and build a chatbot assistant that you can interact with to get more insights about your images or videos.

Vision-Language Models: An Introduction

VLMs are a form of artificial intelligence that can understand and learn from visuals and linguistic modalities.

They are trained on vast amounts of data that include images, videos, and text, allowing them to learn patterns and relationships between these modalities. In simple terms, a VLM can look at an image or video and generate a corresponding text description that accurately matches the visual content.

VLMs typically consist of three main components:

  1. An image model that extracts meaningful visual information,
  2. A text model that processes and understands natural language,
  3. A fusion mechanism that combines the representations learned by the image and text models, enabling cross-modal interactions.

Generally speaking, the image model — also known as the vision encoder — extracts visual features from input images and maps them to the language model’s input space, creating visual tokens. The text model then processes and understands natural language by generating text embeddings. Lastly, these visual and textual representations are combined through the fusion mechanism, allowing the model to integrate visual and textual information.
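
To make this flow more concrete, here is a deliberately simplified Python sketch of how the three components hand data to one another. The function names, token counts, and dimensions are illustrative assumptions, not a real architecture.

#python
import numpy as np

# Illustrative stand-ins for the three VLM components described above.
def vision_encoder(image_path):
    # A real vision encoder turns image patches into visual tokens;
    # here we fake 196 patch tokens with 768 dimensions each.
    return np.random.rand(196, 768)

def text_embedder(token_ids):
    # A real text model maps token ids to embeddings of the same width.
    return np.random.rand(len(token_ids), 768)

def fusion(visual_tokens, text_embeddings):
    # Many VLMs place visual and textual tokens in a single sequence
    # and let attention layers mix the two modalities.
    return np.concatenate([visual_tokens, text_embeddings], axis=0)

fused = fusion(vision_encoder("photo.jpg"), text_embedder([101, 2023, 102]))
print(fused.shape)  # (199, 768): one shared sequence of visual and textual tokens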

VLMs bring a new level of intelligence to applications by bridging visual and linguistic understanding. Here are some of the applications where VLMs shine:

  • Image captions: VLMs can provide automatic descriptions that enrich user experiences, improve searchability, and make visual content more accessible to people with vision impairments.
  • Visual answers to questions: VLMs could be integrated into educational tools to help students learn more deeply by allowing them to ask questions about visuals they encounter in learning materials, such as complex diagrams and illustrations.
  • Document analysis: VLMs can streamline document review processes, identifying critical information in contracts, reports, or patents much faster than reviewing them manually.
  • Image search: VLMs could open up the ability to perform reverse image searches. For example, an e-commerce site might allow users to upload image files that are processed to identify similar products that are available for purchase.
  • Content moderation: Social media platforms could benefit from VLMs by identifying and removing harmful or sensitive content automatically before publishing it.
  • Robotics: In industrial settings, robots equipped with VLMs can perform quality control tasks by understanding visual cues and describing defects accurately.

This is merely an overview of what VLMs are and the pieces that come together to generate audio descriptions. To get a clearer idea of how VLMs work, let’s look at a few real-world examples that leverage VLM processes.

VLM Examples

Based on the use cases we covered alone, you can probably imagine that VLMs come in many forms, each with its unique strengths and applications. In this section, we will look at a few examples of VLMs that can be used for a variety of different purposes.

IDEFICS

IDEFICS is an open-access model inspired by DeepMind’s Flamingo, designed to understand and generate text from images and text inputs. It’s similar to OpenAI’s GPT-4 model in its multimodal capabilities but is built entirely from publicly available data and models.

IDEFICS is trained on public data and models — like LLaMA v1 and OpenCLIP — and comes in two versions: the base and instructed versions, each available in 9 billion and 80 billion parameter sizes.

The model combines two pre-trained unimodal models (for vision and language) with newly added Transformer blocks that allow it to bridge the gap between understanding images and text. It’s trained on a mix of image-text pairs and multimodal web documents, enabling it to handle a wide range of visual and linguistic tasks. As a result, IDEFICS can answer questions about images, provide detailed descriptions of visual content, generate stories based on a series of images, and function as a pure language model when no visual input is provided.

PaliGemma

PaliGemma is an advanced VLM that draws inspiration from PaLI-3 and leverages open-source components like the SigLIP vision model and the Gemma language model.

Designed to process both images and textual input, PaliGemma excels at generating descriptive text in multiple languages. Its capabilities extend to a variety of tasks, including image captioning, answering questions from visuals, reading text, detecting subjects in images, and segmenting objects displayed in images.

The core architecture of PaliGemma pairs a Transformer decoder with a Vision Transformer image encoder, and the combined model weighs in at roughly 3 billion parameters. The text decoder is derived from Gemma-2B, while the image encoder is based on SigLIP-So400m/14.

Through training methods similar to PaLI-3, PaliGemma achieves exceptional performance across numerous vision-language challenges.

PaliGemma is offered in two distinct sets:

  • General Purpose Models (PaliGemma): These pre-trained models are designed for fine-tuning a wide array of tasks, making them ideal for practical applications.
  • Research-Oriented Models (PaliGemma-FT): Fine-tuned on specific research datasets, these models are tailored for deep research on a range of topics.

Phi-3-Vision-128K-Instruct

The Phi-3-Vision-128K-Instruct model is a Microsoft-backed venture that combines text and vision capabilities. It’s built on a dataset of high-quality, reasoning-dense data from both text and visual sources. Part of the Phi-3 family, the model has a context length of 128K, making it suitable for a range of applications.

You might decide to use Phi-3-Vision-128K-Instruct in cases where your application has limited memory and computing power, thanks to its relatively lightweight architecture, which helps with latency. The model works best for generally understanding images, recognizing characters in text, and describing charts and tables.

Yi Vision Language (Yi-VL)

Yi-VL is an open-source AI model developed by 01-ai that can have multi-round conversations with images by reading text from images and translating it. This model is part of the Yi LLM series and has two versions: 6B and 34B.

What distinguishes Yi-VL from other models is its ability to carry a conversation, whereas other models are typically limited to a single text input. Plus, it’s bilingual, making it more versatile in a variety of language contexts.

Finding And Evaluating VLMs

There are many, many VLMs and we only looked at a few of the most notable offerings. As you commence work on an application with image-to-text capabilities, you may find yourself wondering where to look for VLM options and how to compare them.

There are two resources in the Hugging Face community you might consider using to help you find and compare VLMs. I use these regularly and find them incredibly useful in my work.

Vision Arena

Vision Arena is a leaderboard that ranks VLMs based on anonymous user voting and reviews. But what makes it great is the fact that you can compare any two models side-by-side for yourself to find the best fit for your application.

And when you compare two models, you can contribute your own anonymous votes and reviews for others to lean on as well.

OpenVLM Leaderboard

OpenVLM is another leaderboard hosted on Hugging Face for getting technical specs on different models. What I like about this resource is the wealth of metrics for evaluating VLMs, including the speed and accuracy of a given VLM.

Further, OpenVLM lets you filter models by size, type of license, and other ranking criteria. I find it particularly useful for finding VLMs I might have overlooked or new ones I haven’t seen yet.

Text-To-Speech Technology

Earlier, I mentioned that the app we are about to build will use vision-language models to generate written descriptions of images, which are then read aloud. The technology that handles converting text to audio speech is known as text-to-speech synthesis or simply text-to-speech (TTS).

TTS converts written text into synthesized speech that sounds natural. The goal is to take published content, like a blog post, and read it out loud in a realistic-sounding human voice.

So, how does TTS work? First, it breaks down text into the smallest units of sound, called phonemes, and this process allows the system to figure out proper word pronunciations. Next, AI enters the mix, including deep learning algorithms trained on hours of human speech data. This is how we get the app to mimic human speech patterns, tones, and rhythms — all the things that make for “natural” speech. The AI component is key as it elevates a voice from robotic to something with personality. Finally, the system combines the phoneme information with the AI-powered digital voice to render the fully expressive speech output.

The result is automatically generated speech that sounds fairly smooth and natural. Modern TTS systems are extremely advanced in that they can replicate different tones and voice inflections, work across languages, and understand context. This naturalness makes TTS ideal for humanizing interactions with technology, like having your device read text messages out loud to you, just like Apple’s Siri or Microsoft’s Cortana.

TTS Examples

Just as we took a moment to review existing vision language models, let’s pause to consider some of the more popular TTS resources that are available.

Bark

Straight from Bark’s model card in Hugging Face:

“Bark is a transformer-based text-to-audio model created by Suno. Bark can generate highly realistic, multilingual speech as well as other audio — including music, background noise, and simple sound effects. The model can also produce nonverbal communication, like laughing, sighing, and crying. To support the research community, we are providing access to pre-trained model checkpoints ready for inference.”

The non-verbal communication cues are particularly interesting and a distinguishing feature of Bark. Check out the various things Bark can do to communicate emotion, pulled directly from the model’s GitHub repo:

  • [laughter]
  • [laughs]
  • [sighs]
  • [music]
  • [gasps]
  • [clears throat]

This could be cool or creepy, depending on how it’s used, but reflects the sophistication we’re working with. In addition to laughing and gasping, Bark is different in that it doesn’t work with phonemes like a typical TTS model:

“It is not a conventional TTS model but instead a fully generative text-to-audio model capable of deviating in unexpected ways from any given script. Different from previous approaches, the input text prompt is converted directly to audio without the intermediate use of phonemes. It can, therefore, generalize to arbitrary instructions beyond speech, such as music lyrics, sound effects, or other non-speech sounds.”

Coqui

Coqui/XTTS-v2 can clone voices in different languages. All it needs for training is a short six-second clip of audio. This means the model can be used to translate audio snippets from one language into another while maintaining the same voice.

At the time of writing, Coqui currently supports 16 languages, including English, Spanish, French, German, Italian, Portuguese, Polish, Turkish, Russian, Dutch, Czech, Arabic, Chinese, Japanese, Hungarian, and Korean.
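
To give a sense of how little code this takes, here is a rough sketch of voice cloning with XTTS-v2 through the Coqui TTS Python library. The model identifier and method names follow Coqui’s documented usage, and the file paths are placeholders, so treat the details as assumptions to verify against the current documentation.

#python
from TTS.api import TTS  # pip install TTS

# Load the multilingual XTTS-v2 model (identifier taken from Coqui's docs).
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

# Clone the voice from a short reference clip and speak new text in Spanish.
tts.tts_to_file(
    text="Hola, esta es una prueba de clonación de voz.",
    speaker_wav="reference_six_seconds.wav",  # placeholder: the ~6-second sample
    language="es",
    file_path="cloned_output.wav",
)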

Parler-TTS

Parler-TTS excels at generating high-quality, natural-sounding speech in the style of a given speaker. In other words, it replicates a person’s voice. This is where many folks might draw an ethical line because techniques like this can be used to essentially imitate a real person, even without their consent, in a process known as a “deepfake.” The consequences can range from benign impersonations to full-on phishing attacks.

But that’s not really the aim of Parler-TTS. Rather, it’s good in contexts that require personalized and natural-sounding speech generation, such as voice assistants and possibly even accessibility tooling to aid visual impairments by announcing content.

TTS Arena Leaderboard

Do you know how I shared the OpenVLM Leaderboard for finding and comparing vision language models? Well, there’s an equivalent leaderboard for TTS models as well over at the Hugging Face community called TTS Arena.

TTS models are ranked by the “naturalness” of their voices, with the most natural-sounding models ranked first. Developers like you and me vote and provide feedback that influences the rankings.

TTS API Providers

What we just looked at are TTS models that are baked into whatever app we’re making. However, some models are consumable via API, so it’s possible to get the benefits of a TTS model without the added bloat if a particular model is made available by an API provider.

Whether you decide to bundle TTS models in your app or integrate them via APIs is totally up to you. There is no right answer as far as saying one method is better than another — it’s more about the app’s requirements and whether the dependability of a baked-in model is worth the memory hit or vice-versa.

All that being said, I want to call out a handful of TTS API providers for you to keep in your back pocket.

ElevenLabs

ElevenLabs offers a TTS API that uses neural networks to make voices sound natural. Voices can be customized for different languages and accents, leading to realistic, engaging voices.

Try the model out for yourself on the ElevenLabs site. You can enter a block of text and choose from a wide variety of voices that read the submitted text aloud.

Colossyan

Colossyan’s text-to-speech API converts text into natural-sounding voice recordings in over 70 languages and accents. From there, the service allows you to match the audio to an avatar to produce something like a complete virtual presentation based on your voice — or someone else’s.

Once again, this is encroaching on deepfake territory, but it’s really interesting to think of Colossyan’s service as a virtual casting call for actors to perform off a script.

Murf.ai

Murf.ai is yet another TTS API designed to generate voiceovers based on real human voices. The service provides a slew of premade voices you can use to generate audio for anything from explainer videos and audiobooks to course lectures and entire podcast episodes.

Amazon Polly

Amazon has its own TTS API called Polly. You can customize the voices using lexicons and Speech Synthesis Markup (SSML) tags for establishing speaking styles with affordances for adjusting things like pitch, speed, and volume.
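
As a rough illustration of how those SSML adjustments look in practice, here is a sketch using the boto3 Polly client. The voice ID, prosody values, and output handling are assumptions chosen for the example, and the call assumes AWS credentials are already configured.

#python
import boto3

polly = boto3.client("polly")

# SSML lets you nudge the speaking style: rate, pitch, and volume in this example.
ssml = '<speak><prosody rate="90%" pitch="-5%" volume="medium">Hello from Polly.</prosody></speak>'

response = polly.synthesize_speech(
    Text=ssml,
    TextType="ssml",
    OutputFormat="mp3",
    VoiceId="Joanna",
)

with open("polly_output.mp3", "wb") as audio_file:
    audio_file.write(response["AudioStream"].read())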

PlayHT

The PlayHT TTS API generates speech in 142 languages. Type what you want it to say, pick a voice, and download the output as an MP3 or WAV file.

Demo: Building An Image-to-Audio Interface

So far, we have discussed the two primary components for generating audio from text: vision-language models and text-to-speech models. We’ve covered what they are, where they fit into the process of generating real-sounding speech, and various examples of each model.

Now, it’s time to apply those concepts to the app we are building in this tutorial (and will improve in a second tutorial). We will use a VLM so the app can glean meaning and context from images, a TTS model to generate speech that mimics a human voice, and then integrate our work into a user interface for submitting images that will lead to generated speech output.

I have decided to base our work on a VLM by Salesforce called BLIP, a TTS model from Kakao Enterprise called VITS, and Gradio as a framework for the design interface. I’ve covered Gradio extensively in other articles, but the gist is that it is a Python library for building web interfaces — only it offers built-in tools for working with machine learning models that make Gradio ideal for a tutorial like this.

You can use completely different models if you like. The whole point is less about the intricacies of a particular model than it is to demonstrate how the pieces generally come together.

Oh, and one more detail worth noting: I am working with the code for all of this in Google Colab. I’m using it because it’s hosted and ideal for demonstrations like this. But you can certainly work in a more traditional IDE, like VS Code.

Installing Libraries

First, we need to install the necessary libraries:

#python
!pip install gradio pillow transformers scipy numpy

We can upgrade the transformers library to the latest version if we need to:

#python
!pip install --upgrade transformers

Not sure if you need to upgrade? Here’s how to check the current version:

#python
import transformers
print(transformers.__version__)

OK, now we are ready to import the libraries:

#python
import gradio as gr
from PIL import Image
from transformers import pipeline
import scipy.io.wavfile as wavfile
import numpy as np

These libraries will help us process images, use models on the Hugging Face hub, handle audio files, and build the UI.

Creating Pipelines

Since we will pull our models directly from Hugging Face’s model hub, we can tap into them using pipelines. This way, we’re working with an API for tasks that involve natural language processing and computer vision without carrying the load in the app itself.

We set up our pipeline like this:

#python
caption_image = pipeline("image-to-text", model="Salesforce/blip-image-captioning-large")

This establishes a pipeline for us to access BLIP for converting images into textual descriptions. Again, you could establish a pipeline for any other model in the Hugging Face hub.
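
If you want to confirm the captioning pipeline works before wiring up the rest of the app, you can run it against a local test image. The file name here is just a placeholder, and the printed caption is only an example of what BLIP might return.

#python
from PIL import Image

# Quick sanity check of the captioning pipeline.
test_caption = caption_image(images=Image.open("sample.jpg"))
print(test_caption[0]["generated_text"])  # e.g., "a dog sitting on a wooden bench"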

We’ll need a pipeline connected to our TTS model as well:

#python
Narrator = pipeline("text-to-speech", model="kakao-enterprise/vits-ljs")

Now, we have a pipeline where we can pass our image text to be converted into natural-sounding speech.
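
As a quick optional check, you can narrate a short sentence and inspect the structure the pipeline returns, which is exactly what the conversion function in the next step relies on. The printed sampling rate is only an example value.

#python
# Inspect the dictionary returned by the TTS pipeline.
sample = Narrator("Testing the text-to-speech pipeline.")
print(sample["sampling_rate"])   # e.g., 22050
print(len(sample["audio"][0]))   # number of samples in the generated waveform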

Converting Text to Speech

What we need now is a function that handles the audio conversion. Your code will differ depending on the TTS model in use, but here is how I approached the conversion based on the VITS model:

#python

def generate_audio(text):
  # Generate speech from the input text using the Narrator (VITS model)
  Narrated_Text = Narrator(text)

  # Extract the audio data and sampling rate
  audio_data = np.array(Narrated_Text["audio"][0])
  sampling_rate = Narrated_Text["sampling_rate"]

  # Save the generated speech as a WAV file
  wavfile.write("generated_audio.wav", rate=sampling_rate, data=audio_data)

  # Return the filename of the saved audio file
  return "generated_audio.wav"

That’s great, but we need to make sure there’s a bridge that connects the text that the app generates from an image to the speech conversion. We can write a function that uses BLIP to generate the text and then calls the generate_audio() function we just defined:

#python
def caption_my_image(pil_image):
  # Use BLIP to generate a text description of the input image
  semantics = caption_image(images=pil_image)[0]["generated_text"]

  # Generate audio from the text description
  return generate_audio(semantics)
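
Before we add an interface, note that you can already exercise the full chain in a notebook cell using the functions defined above. The image path is a placeholder.

#python
# Run the complete image-to-audio chain outside the UI.
audio_path = caption_my_image(Image.open("sample.jpg"))
print("Narration saved to:", audio_path)  # generated_audio.wav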

Building The User Interface

Our app would be pretty useless if there was no way to interact with it. This is where Gradio comes in. We will use it to create a form that accepts an image file as an input and then outputs the generated speech as an audio file you can play and download.

#python

main_tab = gr.Interface(
  fn=caption_my_image,
  inputs=[gr.Image(label="Select Image", type="pil")],
  outputs=[gr.Audio(label="Generated Audio")],
  title=" Image Audio Description App",
  description="This application provides audio descriptions for images."
)

# Information tab
info_tab = gr.Markdown("""
  # Image Audio Description App
  ### Purpose
  This application is designed to assist visually impaired users by providing audio descriptions of images. It can also be used in various scenarios such as creating audio captions for educational materials, enhancing accessibility for digital content, and more.

  ### Limits
  - The quality of the description depends on the image clarity and content.
  - The application might not work well with images that have complex scenes or unclear subjects.
  - Audio generation time may vary depending on the input image size and content.
  ### Note
  - Ensure the uploaded image is clear and well-defined for the best results.
  - This app is a prototype and may have limitations in real-world applications.
""")

# Combine both tabs into a single app
demo = gr.TabbedInterface(
  [main_tab, info_tab],
  tab_names=["Main", "Information"]
)

demo.launch()

The interface is quite plain and simple, but that’s OK since our work is purely for demonstration purposes. You can always add to this for your own needs. The important thing is that you now have a working application you can interact with.

At this point, you could run the app and try it in Google Colab. You also have the option to deploy your app, though you’ll need hosting for it. Hugging Face also has a feature called Spaces that you can use to deploy your work and run it without Google Colab. There’s even a guide you can use to set up your own Space.

Here’s the final app, which you can try by uploading your own photo.

Coming Up…

We covered a lot of ground in this tutorial! In addition to learning about VLMs and TTS models at a high level, we looked at different examples of them and then covered how to find and compare models.

But the rubber really met the road when we started work on our app. Together, we made a useful tool that generates text from an image file and then sends that text to a TTS model to convert it into speech that is announced out loud and available to download as a WAV file.

But we’re not done just yet! What if we could glean even more detailed information from images and our app not only describes the images but can also carry on a conversation about them?

Sounds exciting, right? This is exactly what we’ll do in the second part of this tutorial.

Getting To The Bottom Of Minimum WCAG-Conformant Interactive Element Size


There are many rumors and misconceptions about conforming to WCAG criteria for the minimum sizing of interactive elements. I’d like to use this post to demystify what is needed for baseline compliance and to point out an approach for making successful and inclusive interactive experiences using ample target sizes.

Minimum Conformant Pixel Size

Getting right to it: When it comes to pure Web Content Accessibility Guidelines (WCAG) conformance, the bare minimum pixel size for an interactive, non-inline element is 24×24 pixels. This is outlined in Success Criterion 2.5.8: Target Size (Minimum).

Success Criterion 2.5.8 is level AA, which is the most commonly used level for public, mass-consumed websites. This Success Criterion (or SC for short) is sometimes confused for SC 2.5.5 Target Size (Enhanced), which is level AAA. The two are distinct and provide separate guidance for properly sizing interactive elements, even if they appear similar at first glance.

SC 2.5.8 is relatively new to WCAG, having been released as part of WCAG version 2.2, which was published on October 5th, 2023. WCAG 2.2 is the most current version of the standard, but this newer release date means that knowledge of its existence isn’t as widespread as the older SC, especially outside of web accessibility circles. That said, WCAG 2.2 will remain the standard until WCAG 3.0 is released, something that is likely going to take 10–15 years or more to happen.

SC 2.5.5 calls for larger interactive elements sizes that are at least 44×44 pixels (compared to the SC 2.5.8 requirement of 24×24 pixels). At the same time, notice that SC 2.5.5 is level AAA (compared to SC 2.5.8, level AA) which is a level reserved for specialized support beyond level AA.

Sites that need to be fully WCAG Level AAA conformant are rare. Chances are that if you are making a website or web app, you’ll only need to support level AA. Level AAA is often reserved for large or highly specialized institutions.

Making Interactive Elements Larger With CSS Padding

The family of padding-related properties in CSS can be used to extend the interactive area of an element to make it conformant. For example, declaring padding: 4px; on an element that measures 16×16 pixels invisibly increases its bounding box to a total of 24×24 pixels. This, in turn, means the interactive element satisfies SC 2.5.8.

This is a good trick for making smaller interactive elements easier to click and tap. If you want more information about this sort of thing, I enthusiastically recommend Ahmad Shadeed’s post, “Designing better target sizes”.

I think it’s also worth noting that CSS margin could also hypothetically be used to achieve level AA conformance since the SC includes a spacing exception:

The size of the target for pointer inputs is at least 24×24 CSS pixels, except where:

Spacing: Undersized targets (those less than 24×24 CSS pixels) are positioned so that if a 24 CSS pixel diameter circle is centered on the bounding box of each, the circles do not intersect another target or the circle for another undersized target;

[…]

The difference here is that padding extends the interactive area, while margin does not. Through this lens, you’ll want to honor the spirit of the success criterion because partial conformance is adversarial conformance. At the end of the day, we want to help people successfully click or tap interactive elements, such as buttons.

What About Inline Interactive Elements?

We tend to think of targets in terms of block elements — elements that are displayed on their own line, such as a button at the end of a call-to-action. However, interactive elements can be inline elements as well. Think of links in a paragraph of text.

Inline interactive elements, such as text links in paragraphs, do not need to meet the 24×24 pixel minimum requirement. Just as margin is an exception in SC 2.5.8: Target Size (Minimum), so are inline elements with an interactive target:

The size of the target for pointer inputs is at least 24×24 CSS pixels, except where:

[…]

Inline: The target is in a sentence or its size is otherwise constrained by the line-height of non-target text;

[…]

Apple And Android: The Source Of More Confusion

If the differences between interactive elements that are inline and block are still confusing, that’s probably because the whole situation is even further muddied by third-party human interface guidelines requiring interactive sizes closer to what the level AAA Success Criterion 2.5.5 Target Size (Enhanced) demands.

For example, Apple’s “Human Interface Guidelines” and Google’s “Material Design” are guidelines for how to design interfaces for their respective platforms. Apple’s guidelines recommend that interactive elements are 44×44 points, whereas Google’s guides stipulate target sizes that are at least 48×48 using density-independent pixels.

These may satisfy Apple and Google requirements for designing interfaces, but are they WCAG-conformant? Apple and Google — not to mention any other organization with UI guidelines — can specify whatever interface requirements they want, but are they copacetic with WCAG SC 2.5.5 and SC 2.5.8?

It’s important to ask this question because there is a hierarchy when it comes to accessibility compliance, and it contains legal levels:

Human interface guidelines often inform design systems, which, in turn, influence the sites and apps that are built by authors like us. But they’re not the “authority” on accessibility compliance. Notice how everything is (and ought to be) influenced by WCAG at the very top of the chain.

Even if these third-party interface guidelines conform to SC 2.5.5 and 2.5.8, it’s still tough to tell when they are expressed in “points” and “density-independent pixels,” which aren’t pixels but often get conflated as such. I’d advise against getting too deep into researching what a pixel truly is. Trust me when I say it’s a road you don’t want to go down. But whatever the case, the inconsistent use of unit sizes exacerbates the issue.

Can’t We Just Use A Media Query?

I’ve also observed some developers attempting to use the pointer media feature as a clever “trick” to detect when a touchscreen is present, then conditionally adjust an interactive element’s size as a way to get around the WCAG requirement.

After all, mouse cursors are for fine movements, and touchscreens are for more broad gestures, right? Not always. The thing is, devices are multimodal. They can support many different kinds of input and don’t require a special switch to flip or button to press to do so. A straightforward example of this is switching between a trackpad and a keyboard while you browse the web. A less considered example is a device with a touchscreen that also supports a trackpad, keyboard, mouse, and voice input.

You might think that the combination of trackpad, keyboard, mouse, and voice inputs sounds like some sort of absurd, obscure Frankencomputer, but what I just described is a Microsoft Surface laptop, and guess what? They’re pretty popular.

Responsive Design Vs. Inclusive Design

There is a difference between the two, even though they are often used interchangeably. Let’s delineate the two as clearly as possible:

  • Responsive Design is about designing for an unknown device.
  • Inclusive Design is about designing for an unknown user.

The other end of this consideration is that people with motor control conditions — like hand tremors or arthritis — can and do use mice inputs. This means that fine input actions may be painful and difficult, yet ultimately still possible to perform.

People also use more precise input mechanisms for touchscreens all the time, including both official accessories and aftermarket devices. In other words, some devices designed to accommodate coarse input can also be used for fine detail work.

I’d be remiss if I didn’t also point out that people plug mice and keyboards into smartphones. We cannot automatically say that they only support coarse pointers.

Context Is King

Conformant and successful interactive areas — both large and small — require knowing the ultimate goals of your website or web app. When you arm yourself with this context, you are empowered to make informed decisions about the kinds of people who use your service, why they use the service, and how you can accommodate them.

For example, the Glow Baby app uses larger interactive elements because it knows the user is likely holding an adorable, albeit squirmy and fussy, baby while using the application. This allows Glow Baby to emphasize the interactive targets in the interface to accommodate parents who have their hands full.

In the same vein, SC 2.5.8 acknowledges that smaller touch targets — such as those used in map apps — may contextually be exempt:

For example, in digital maps, the position of pins is analogous to the position of places shown on the map. If there are many pins close together, the spacing between pins and neighboring pins will often be below 24 CSS pixels. It is essential to show the pins at the correct map location; therefore, the Essential exception applies.

[…]

When the "Essential" exception is applicable, authors are strongly encouraged to provide equivalent functionality through alternative means to the extent practical.

Note that this exemption language is not carte blanche to make your own work an exception to the rule. It is more of a mechanism, and an acknowledgment that broadly applied rules may have exceptions that are worth thinking through and documenting for future reference.

Further Considerations

We also want to consider the larger context of the device itself as well as the environment the device will be used in.

Larger, more fixed position touchscreens compel larger interactive areas. Smaller devices that are moved around in space a lot (e.g., smartwatches) may benefit from alternate input mechanisms such as voice commands.

What about people who are driving in a car? People in this context probably ought to be provided straightforward, simple interactions that are facilitated via large interactive areas to prevent them from taking their eyes off the road. The same could also be said for high-stress environments like hospitals and oil rigs.

Similarly, devices and apps that are designed for children may require interactive areas that are larger than WCAG requirements for interactive areas. So would experiences aimed at older demographics, where age-derived vision and motor control disability factors tend to be more present.

Minimum conformant interactive area experiences may also make sense in their own contexts. Data-rich, information-dense experiences like the Bloomberg terminal come to mind here.

Design Systems Are Also Worth Noting

While you can control what components you include in a design system, you cannot control where and how they’ll be used by those who adopt and use that design system. Because of this, I suggest defensively baking accessible defaults into your design systems because they can go a long way toward incorporating accessible practices when they’re integrated right out of the box.

One option worth consideration is providing an accessible range of choices. Components, like buttons, can have size variants (e.g., small, medium, and large), and you can provide a minimally conformant interactive target on the smallest variant and then offer larger, equally conformant versions.

So, How Do We Know When We’re Good?

There is no magic number or formula to get you that perfect Goldilocks “not too small, not too large, but just right” interactive area size. It requires knowledge of what the people who want to use your service want, and how they go about getting it.

The best way to learn that? Ask people.

Accessibility research includes more than just asking people who use screen readers what they think. It’s also a lot easier to conduct than you might think! For example, prototypes are a great way to quickly and inexpensively evaluate and de-risk your ideas before committing to writing production code. “Conducting Accessibility Research In An Inaccessible Ecosystem” by Dr. Michele A. Williams is chock full of tips, strategies, and resources you can use to help you get started with accessibility research.

Wrapping Up

The bottom line is that

“Compliant” does not always equate to “usable.” But compliance does help set baseline requirements that benefit everyone.

To sum things up:

  • 24×24 pixels is the bare minimum in terms of WCAG conformance.
  • Inline interactive elements, such as links placed in paragraphs, are exempt.
  • 44×44 pixels is for WCAG level AAA support, and level AAA is reserved for specialized experiences.
  • Human interface guidelines by the likes of Apple, Android, and other companies must ultimately conform to WCAG.
  • Devices are multimodal and can use different kinds of input concurrently.
  • Baking sensible accessible defaults into design systems can go a long way to ensuring widespread compliance.
  • Larger interactive elements may be helpful in many situations, but they might not be recognized as interactive if they are too large.
  • User research can help you learn about your audience.

And, perhaps most importantly, all of this is about people and enabling them to get what they need.

Further Reading

Build Design Systems With Penpot Components


This article is sponsored by Penpot

If you’ve been following along with our Penpot series, you’re already familiar with this exciting open-source design tool and how it is changing the game for designer-developer collaboration. Previously, we’ve explored Penpot’s Flex Layout and Grid Layout features, which bring the power of CSS directly into the hands of designers.

Today, we’re diving into another crucial aspect of modern web design and development: components. This feature is a part of Penpot’s major 2.0 release, which introduces a host of new capabilities to bridge the gap between design and code further. Let’s explore how Penpot’s implementation of components can supercharge your design workflow and foster even better collaboration across teams.

About Components

Components are reusable building blocks that form the foundation of modern user interfaces. They encapsulate a piece of UI or functionality that can be reused across your application. This concept of composability — building complex systems from smaller, reusable parts — is a cornerstone of modern web development.

Why does composability matter? There are several key benefits:

  • Single source of truth
    Changes to a component are reflected everywhere it’s used, ensuring consistency.
  • Flexibility with simpler dependencies
    Components can be easily swapped or updated without affecting the entire system.
  • Easier maintenance and scalability
    As your system grows, components help manage complexity.

In the realm of design, this philosophy is best expressed in the concept of design systems. When done right, design systems help to bring your design and code together, reducing ambiguity and streamlining the processes.

However, that’s not so easy to achieve when your designs are built using logic and standards that are much different from the code they’re related to. Penpot works to solve this challenge through its unique approach. Instead of building visual artifacts that only mimic real-world interfaces, UIs in Penpot are built using the same technologies and standards as real working products.

This gives us much better parity between the two mediums and allows designers to build interfaces that are already expressed as code. It fosters easier collaboration as designers and developers can speak the same language when discussing their components. The final result is more maintainable, too. Changes created by designers can propagate consistently, making it easier to manage large-scale systems.

Now, let’s take a look at how components in Penpot work in practice! As an example, I’m going to use the following fictional product page and recreate it in Penpot:

Components In Penpot

Creating Components

To create a component in Penpot, simply select the objects you want to include and select “Create component” from the context menu. This transforms your selection into a reusable element.

Creating Component Variants

Penpot allows you to create variants of your components. These are alternative versions that share the same basic structure but differ in specific aspects like color, size, or state.

You can create variants by using slashes (/) in the component’s name, for example, by naming your buttons Button/primary and Button/secondary. This will allow you to easily switch between types of a Button component later.

Nesting Components And Using External Libraries

Components in Penpot can be nested, allowing you to build complex UI elements from simpler parts. This mirrors how developers often structure their code. In other words, you can place components inside one another.

Moreover, the components you use don’t have to come from the same file or even from the same organization. You can easily share libraries of components across projects just as you would import code from various dependencies into your codebase. You can also import components from external libraries, such as UI kits and icon sets. Penpot maintains a growing list of such resources for you to choose from, including everything from the large design systems like Material Design to the most popular icon libraries.

Organizing Your Design System

The new major release of Penpot comes with a redesigned Assets panel, which is where your components live. In the Assets panel, you can easily access your components and drag and drop them into designs.

For better maintenance of design systems, Penpot allows you to store your colors and typography as reusable styles. As with components, you can name your styles and organize them into hierarchical structures.

Configuring Components

One of the main benefits of using composable components in front-end libraries such as React is their support of props. Component props (short for properties) allow you a great deal of flexibility in how you configure and customize your components, depending on how, where, and when they are used.

Penpot offers similar capabilities in a design tool with variants and overrides. You can switch variants, hide elements, change styles, swap nested components within instances, or even change the whole layout of a component, providing flexibility while maintaining the link to the original component.

Creating Flexible, Scalable Systems

Allowing you to modify Flex and Grid layouts in component instances is where Penpot really shines. However, the power of these layout features goes beyond the components themselves.

With Flex Layout and Grid Layout, you can build components that are much more faithful to their code and easier to modify and maintain. But having those powerful features at your fingertips means that you can also place your components in other Grid and Flex layouts. That’s a big deal as it allows you to test your components in scenarios much closer to their real environment. Directly in a design tool, you can see how your component would behave if you put it in various places on your website or app. This allows you to fine-tune how your components fit into a larger system. It can dramatically reduce friction between design and code and streamline the handoff process.

Generating Components Code

As Penpot’s components are just web-ready code, one of the greatest benefits of using it is how easily you can export code for your components. This feature, like all of Penpot’s capabilities, is completely free.

Using Penpot’s Inspect panel, you can quickly grab all the layout properties and styles as well as the full code snippets for all components.

Documentation And Annotations

To make design systems in Penpot even more maintainable, it includes annotation features to help you document your components. This is crucial for maintaining a clear design system and ensuring a smooth handoff to developers.

Summary

Penpot’s implementation of components and its support for real CSS layouts make it a standout tool for designers who want to work closely with developers. By embracing web standards and providing powerful, flexible components, Penpot enables designers to create more developer-friendly designs without sacrificing creativity or control.

All of Penpot’s features are completely free for both designers and developers. As open-source software, Penpot lets you fully own your design tool experience and makes it accessible for everyone, regardless of team size and budget.

Ready to dive in? You can explore the file used in this article by downloading it and importing it into your Penpot account.

As the design tool landscape continues to evolve, Penpot is taking charge of bringing designers and developers closer together. Whether you’re a designer looking to understand the development process or a developer seeking to streamline your workflow with designers, Penpot’s component system is worth exploring.

How To Design Effective Conversational AI Experiences: A Comprehensive Guide


Conversational AI is revolutionizing information access, offering a personalized, intuitive search experience that delights users and empowers businesses. A well-designed conversational agent acts as a knowledgeable guide, understanding user intent and effortlessly navigating vast data, which leads to happier, more engaged users, fostering loyalty and trust. Meanwhile, businesses benefit from increased efficiency, reduced costs, and a stronger bottom line. On the other hand, a poorly designed system can lead to frustration, confusion, and, ultimately, abandonment.

Achieving success with conversational AI requires more than just deploying a chatbot. To truly harness this technology, we must master the intricate dynamics of human-AI interaction. This involves understanding how users articulate needs, explore results, and refine queries, paving the way for a seamless and effective search experience.

This article will decode the three phases of conversational search, the challenges users face at each stage, and the strategies and best practices AI agents can employ to enhance the experience.

The Three Phases Of Conversational Search

To analyze these complex interactions, Trippas et al. (2018) (PDF) proposed a framework that outlines three core phases in the conversational search process:

  1. Query formulation: Users express their information needs, often facing challenges in articulating them clearly.
  2. Search results exploration: Users navigate through presented results, seeking further information and refining their understanding.
  3. Query re-formulation: Users refine their search based on new insights, adapting their queries and exploring different avenues.

Building on this framework, Azzopardi et al. (2018) (PDF) identified five key user actions within these phases: reveal, inquire, navigate, interrupt, interrogate, and the corresponding agent actions — inquire, reveal, traverse, suggest, and explain.

In the following sections, I’ll break down each phase of the conversational search journey, delving into the actions users take and the corresponding strategies AI agents can employ, as identified by Azzopardi et al. (2018) (PDF). I’ll also share actionable tactics and real-world examples to guide the implementation of these strategies.

Phase 1: Query Formulation: The Art Of Articulation

In the initial phase of query formulation, users attempt to translate their needs into prompts. This process involves conscious disclosures — sharing details they believe are relevant — and unconscious non-disclosure — omitting information they may not deem important or struggle to articulate.

This process is fraught with challenges. As Jakob Nielsen aptly pointed out,

“Articulating ideas in written prose is hard. Most likely, half the population can’t do it. This is a usability problem for current prompt-based AI user interfaces.”

— Jakob Nielsen

This can manifest as:

  • Vague language: “I need help with my finances.”
    Budgeting? Investing? Debt management?
  • Missing details: “I need a new pair of shoes.”
    What type of shoes? For what purpose?
  • Limited vocabulary: Not knowing the right technical terms. “I think I have a sprain in my ankle.”
    The user might not know the difference between a sprain and a strain or the correct anatomical terms.

These challenges can lead to frustration for users and less relevant results from the AI agent.

AI Agent Strategies: Nudging Users Towards Better Input

To bridge the articulation gap, AI agents can employ three core strategies:

  1. Elicit: Proactively guide users to provide more information.
  2. Clarify: Seek to resolve ambiguities in the user’s query.
  3. Suggest: Offer alternative phrasing or search terms that better capture the user’s intent.

The key to effective query formulation is balancing elicitation and assumption. Overly aggressive questioning can frustrate users, and making too many assumptions can lead to inaccurate results.

For example,

User: “I need a new phone.”

AI: “What’s your budget? What features are important to you? What size screen do you prefer? What carrier do you use?...”

This rapid-fire questioning can overwhelm the user and make them feel like they're being interrogated. A more effective approach is to start with a few open-ended questions and gradually elicit more details based on the user’s responses.

As Azzopardi et al. (2018) (PDF) stated in the paper,

“There may be a trade-off between the efficiency of the conversation and the accuracy of the information needed as the agent has to decide between how important it is to clarify and how risky it is to infer or impute the underspecified or missing details.”
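
To make that trade-off concrete, here is a minimal sketch in plain JavaScript (the slots, their risk flags, and the default value are hypothetical, not from the paper): the agent clarifies only the riskiest missing detail each turn and quietly infers safe defaults for the rest.

// Hypothetical slots for the phone example above, flagged by how risky it is to guess them.
const slots = [
  { name: "budget", riskyToInfer: true, value: null },
  { name: "screen size", riskyToInfer: false, value: null },
  { name: "carrier", riskyToInfer: false, value: null },
];

function nextAgentMove(slots) {
  // Clarify at most one high-risk missing detail per turn, instead of firing every question at once.
  const toClarify = slots.find((slot) => slot.riskyToInfer && slot.value === null);
  if (toClarify) {
    return { action: "clarify", question: `What ${toClarify.name} do you have in mind?` };
  }
  // Quietly infer sensible defaults for the low-risk rest, then search.
  for (const slot of slots) {
    if (slot.value === null) slot.value = "any";
  }
  return { action: "search", slots };
}

console.log(nextAgentMove(slots));
// → { action: "clarify", question: "What budget do you have in mind?" }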

Implementation Tactics And Examples

  • Probing questions: Ask open-ended or clarifying questions to gather more details about the user’s needs. For example, Perplexity Pro uses probing questions to elicit more details about the user’s needs for gift recommendations.

For example, after clicking one of the initial prompts, “Create a personal webpage,” ChatGPT added another sentence, “Ask me 3 questions first on whatever you need to know,” to elicit more details from the user.

  • Interactive refinement: Utilize visual aids like sliders, checkboxes, or image carousels to help users specify their preferences without articulating everything in words. For example, Adobe Firefly’s side settings allow users to adjust their preferences.

  • Suggested prompts: Provide examples of more specific or detailed queries to help users refine their search terms. For example, Nielsen Norman Group provides an interface that offers a suggested prompt to help users refine their initial query.

For example, after clicking one of the initial prompts in Gemini, “Generate a stunning, playful image,” more details are added in blue in the input.

  • Offering multiple interpretations: If the query is ambiguous, present several possible interpretations and let the user choose the most accurate one. For example, Gemini offers a list of gift suggestions for the query “gifts for my friend who loves music,” categorized by the recipient’s potential music interests to help the user pick the most relevant one.

Phase 2: Search Results Exploration: A Multifaceted Journey

Once the query is formed, the focus shifts to exploration. Users embark on a multifaceted journey through search results, seeking to understand their options and make informed decisions.

Two primary user actions mark this phase:

  1. Inquire: Users actively seek more information, asking for details, comparisons, summaries, or related options.
  2. Navigate: Users navigate the presented information, browse through lists, revisit previous options, or request additional results. This involves scrolling, clicking, and using voice commands like “next” or “previous.”

AI Agent Strategies: Facilitating Exploration And Discovery

To guide users through the vast landscape of information, AI agents can employ these strategies:

  1. Reveal: Present information that caters to diverse user needs and preferences.
  2. Traverse: Guide the user through the information landscape, providing intuitive navigation and responding to their evolving interests.

During discovery, it’s vital to avoid information overload, which can overwhelm users and hinder their decision-making. For example,

User: “I’m looking for a place to stay in Tokyo.”

AI: Provides a lengthy list of hotels without any organization or filtering options.

Instead, AI agents should offer the most relevant results and allow users to filter or sort them based on their needs. This might include presenting a few top recommendations based on ratings or popularity, with options to refine the search by price range, location, amenities, and so on.

Additionally, AI agents should understand natural language navigation. For example, if a user asks, “Tell me more about the second hotel,” the AI should provide additional details about that specific option without requiring the user to rephrase their query. This level of understanding is crucial for flexible navigation and a seamless user experience.
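
A rough sketch of what that context-aware navigation could look like under the hood (the result list and helper below are made up for illustration): the agent keeps the options it just showed and maps ordinal phrases back to them, so the user never has to repeat a hotel's name.

// Hypothetical conversation state: the options shown in the previous turn.
const lastShownResults = [
  { id: "h1", name: "Hotel Sakura" },
  { id: "h2", name: "Shinjuku Stay" },
  { id: "h3", name: "Asakusa Inn" },
];

const ordinals = { first: 0, second: 1, third: 2, last: lastShownResults.length - 1 };

function resolveReference(utterance, results) {
  // Map phrases like "the second hotel" back to the item shown earlier.
  const match = utterance.match(/\b(first|second|third|last)\b/i);
  if (!match) return null;
  return results[ordinals[match[1].toLowerCase()]] ?? null;
}

console.log(resolveReference("Tell me more about the second hotel", lastShownResults));
// → { id: "h2", name: "Shinjuku Stay" }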

Implementation Tactics And Examples

  • Diverse formats: Offer results in various formats (lists, summaries, comparisons, images, videos) and allow users to specify their preferences. For example, Gemini presents a summarized format of hotel information, including a photo, price, rating, star rating, category, and brief description to allow the user to evaluate options quickly for the prompt “I’m looking for a place to stay in Paris.”

  • Context-aware navigation: Maintain conversational context, remember user preferences, and provide relevant navigation options. For example, following the previous example prompt, Gemini reminds users of the potential next steps at the end of the response.

  • Interactive exploration: Use carousels, clickable images, filter options, and other interactive elements to enhance the exploration experience. For example, Perplexity offers a carousel of images related to “a vegetarian diet” and other interactive elements like “Watch Videos” and “Generate Image” buttons to enhance exploration and discovery.

  • Multiple responses: Present several variations of a response. For example, users can see multiple draft responses to the same query by clicking the “Show drafts” button in Gemini.

  • Flexible text length and tone: Enable users to customize the length and tone of AI-generated responses to better suit their preferences. For example, Gemini provides multiple options for welcome messages, offering varying lengths, tones, and degrees of formality.

Phase 3: Query Re-formulation: Adapting To Evolving Needs

As users interact with results, their understanding deepens, and their initial query might not fully capture their evolving needs. During query re-formulation, users refine their search based on exploration and new insights, often involving interrupting and interrogating. Query re-formulation empowers users to course-correct and refine their search.

  • Interrupt: Users might pause the conversation to:
    • Correct: “Actually, I meant a desktop computer, not a laptop.”
    • Add information: “I also need it to be good for video editing.”
    • Change direction: “I’m not interested in those options. Show me something else.”
  • Interrogate: Users challenge the AI to ensure it understands their needs and justify its recommendations:
    • Seek understanding: “What do you mean by ‘good battery life’?”
    • Request explanations: “Why are you recommending this particular model?”

AI Agent Strategies: Adapting And Explaining

To navigate the query re-formulation phase effectively, AI agents need to be responsive, transparent, and proactive. Two core strategies for AI agents:

  1. Suggest: Proactively offer alternative directions or options to guide the user towards a more satisfying outcome.
  2. Explain: Provide clear and concise explanations for recommendations and actions to foster transparency and build trust.

AI agents should balance suggestions with relevance and explain why certain options are suggested, while avoiding overwhelming users with unrelated suggestions that increase conversational effort. A bad example would be the following:

User: “I want to visit Italian restaurants in New York.”

AI: Suggests unrelated options, like Mexican restaurants or American restaurants, when the user is interested in Italian cuisine.

This could frustrate the user and reduce trust in the AI.

A better answer could be, “I found these highly-rated Italian restaurants. Would you like to see more options based on different price ranges?” This ensures users understand the reasons behind recommendations, enhancing their satisfaction and trust in the AI's guidance.

Implementation Tactics And Examples

  • Transparent system process: Show the steps involved in generating a response. For example, Perplexity Pro outlines the search process step by step to fulfill the user’s request.

  • Explainable recommendations: Clearly state the reasons behind specific recommendations, referencing user preferences, historical data, or external knowledge. For example, ChatGPT includes recommended reasons for each listed book in response to the question “books for UX designers.”

  • Source reference: Enhance the answer with source references to strengthen the evidence supporting the conclusion. For example, Perplexity presents source references to support the answer.

  • Point-to-select: Users should be able to directly select specific elements or locations within the dialogue for further interaction rather than having to describe them verbally. For example, users can select part of an answer and ask a follow-up in Perplexity.

  • Proactive recommendations: Suggest related or complementary items based on the user’s current selections. For example, Perplexity offers a list of related questions to guide the user’s exploration of “a vegetarian diet.”

Overcoming LLM Shortcomings

While the strategies discussed above can significantly improve the conversational search experience, LLMs still have inherent limitations that can hinder their intuitiveness. These include the following:

  • Hallucinations: Generating false or nonsensical information.
  • Lack of common sense: Difficulty understanding queries that require world knowledge or reasoning.
  • Sensitivity to input phrasing: Producing different responses to slightly rephrased queries.
  • Verbosity: Providing overly lengthy or irrelevant information.
  • Bias: Reflecting biases present in the training data.

To create truly effective and user-centric conversational AI, it’s crucial to address these limitations and make interactions more intuitive. Here are some key strategies:

  • Incorporate structured knowledge
    Integrating external knowledge bases or databases can ground the LLM’s responses in facts, reducing hallucinations and improving accuracy (a rough sketch follows after this list).
  • Fine-tuning
    Training the LLM on domain-specific data enhances its understanding of particular topics and helps mitigate bias.
  • Intuitive feedback mechanisms
    Allow users to easily highlight and correct inaccuracies or provide feedback directly within the conversation. This could involve clickable elements to flag problematic responses or a “this is incorrect” button that prompts the AI to reconsider its output.
  • Natural language error correction
    Develop AI agents capable of understanding and responding to natural language corrections. For example, if a user says, “No, I meant X,” the AI should be able to interpret this as a correction and adjust its response accordingly.
  • Adaptive learning
    Implement machine learning algorithms that allow the AI to learn from user interactions and improve its performance over time. This could involve recognizing patterns in user corrections, identifying common misunderstandings, and adjusting behavior to minimize future errors.
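
As a rough, purely illustrative sketch of the structured-knowledge and error-correction points above: searchKnowledgeBase and callModel are hypothetical stand-ins, not real APIs, and the prompt wording is just an example.

// Stub stand-ins for a real retrieval layer and LLM API; both are hypothetical.
async function searchKnowledgeBase(query) {
  return [`No stored facts matched "${query}" in this demo.`];
}
async function callModel(prompt) {
  return `(model response to a ${prompt.length}-character grounded prompt)`;
}

// Ground the model in retrieved facts to reduce hallucinations.
async function answer(query) {
  const facts = await searchKnowledgeBase(query);
  const prompt = [
    "Answer using only the facts below. Say you don't know if they are not enough.",
    ...facts.map((fact) => `- ${fact}`),
    `Question: ${query}`,
  ].join("\n");
  return callModel(prompt);
}

// Treat "No, I meant ..." as a correction of the previous query, not a brand-new topic.
function interpretTurn(userMessage, previousQuery) {
  const correction = userMessage.match(/^no,?\s+i meant\s+(.+)/i);
  if (correction && previousQuery) {
    return { type: "correction", revisedQuery: correction[1] };
  }
  return { type: "new-query", query: userMessage };
}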

Training AI Agents For Enhanced User Satisfaction

Understanding and evaluating user satisfaction is fundamental to building effective conversational AI agents. However, directly measuring user satisfaction in the open-domain search context can be challenging, as Zhumin Chu et al. (2022) highlighted. Traditionally, metrics like session abandonment rates or task completion were used as proxies, but these don’t fully capture the nuances of user experience.

To address this, Clemencia Siro et al. (2023) offer a comprehensive approach to gathering and leveraging user feedback:

  • Identify key dialogue aspects
    To truly understand user satisfaction, we need to look beyond simple metrics like “thumbs up” or “thumbs down.” Consider evaluating aspects like relevance, interestingness, understanding, task completion, interest arousal, and efficiency. This multi-faceted approach provides a more nuanced picture of the user’s experience.
  • Collect multi-level feedback
    Gather feedback at both the turn level (each question-answer pair) and the dialogue level (the overall conversation). This granular approach pinpoints specific areas for improvement, both in individual responses and in the overall flow of the conversation (see the sketch after this list).
  • Recognize individual differences
    Understand that the concept of satisfaction varies per user. Avoid assuming all users perceive satisfaction similarly.
  • Prioritize relevance
    While all aspects are important, relevance (at the turn level) and understanding (at both the turn and session level) have been identified as key drivers of user satisfaction. Focus on improving the AI agent’s ability to provide relevant and accurate responses that demonstrate a clear understanding of the user’s intent.
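
A hedged sketch of what that multi-level feedback could look like as data (the aspect names mirror the list above; the shape, ids, and scores are hypothetical):

// Hypothetical shape for feedback collected at both the turn and dialogue level.
const dialogueFeedback = {
  dialogueId: "d-042",
  turns: [
    {
      turnId: 1,
      question: "books for UX designers",
      // Turn-level aspects, rated 1-5 by the user.
      ratings: { relevance: 4, understanding: 5, interestingness: 3 },
    },
  ],
  // Dialogue-level aspects for the conversation as a whole.
  overall: { taskCompletion: 4, efficiency: 3, interestArousal: 4 },
};

function averageAspect(feedbackItems, aspect) {
  // Aggregate a turn-level aspect across many dialogues to spot weak areas.
  const values = feedbackItems.flatMap((d) => d.turns.map((t) => t.ratings[aspect]));
  return values.reduce((a, b) => a + b, 0) / values.length;
}

console.log(averageAspect([dialogueFeedback], "relevance")); // → 4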

Additionally, consider these practical tips for incorporating user satisfaction feedback into the AI agent’s training process:

  • Iterate on prompts
    Use user feedback to refine the prompts to elicit information and guide the conversation.
  • Refine response generation
    Leverage feedback to improve the relevance and quality of the AI agent’s responses.
  • Personalize the experience
    Tailor the conversation to individual users based on their preferences and feedback.
  • Continuously monitor and improve
    Regularly collect and analyze user feedback to identify areas for improvement and iterate on the AI agent’s design and functionality.

The Future Of Conversational Search: Beyond The Horizon

The evolution of conversational search is far from over. As AI technologies continue to advance, we can anticipate exciting developments:

  • Multi-modal interactions
    Conversational search will move beyond text, incorporating voice, images, and video to create more immersive and intuitive experiences.
  • Personalized recommendations
    AI agents will become more adept at tailoring search results to individual users, considering their past interactions, preferences, and context. This could involve suggesting restaurants based on dietary restrictions or recommending movies based on previously watched titles.
  • Proactive assistance
    Conversational search systems will anticipate user needs and proactively offer information or suggestions. For instance, an AI travel agent might suggest packing tips or local customs based on a user’s upcoming trip.

When Friction Is A Good Thing: Designing Sustainable E-Commerce Experiences

As lavish influencer lifestyles, wealth flaunting, and hauls dominate social media feeds, we shouldn’t be surprised that excessive consumption has become the default way of living. We see closets filled to the brim with cheap, throw-away items and having the latest gadget arsenal as signifiers of an aspirational life.

Consumerism, however, is more than a cultural trend; it’s the backbone of our economic system. Companies eagerly drive excessive consumption as an increase in sales is directly connected to an increase in profit.

While we learned to accept this level of material consumption as normal, we need to be reminded of the massive environmental impact that comes along with it. As Yvon Chouinard, founder of Patagonia, writes in a New York Times article:

“Obsession with the latest tech gadgets drives open pit mining for precious minerals. Demand for rubber continues to decimate rainforests. Turning these and other raw materials into final products releases one-fifth of all carbon emissions.”

— Yvon Chouinard

In the paper, Scientists’ Warning on Affluence, a group of researchers concluded that reducing material consumption today is essential to avoid the worst of the looming climate change in the coming years. This need for lowering consumption is also reflected in the UN’s Sustainable Development Goals, specifically Goal 12, “Ensure sustainable consumption and production patterns”.

For a long time, design has been a tool for consumer engineering, for example, by designing products with an artificially limited useful life (planned obsolescence) to ensure continuous consumption. And if we want to understand specifically UX design’s role in influencing how much and what people buy, we have to take a deeper look at pushy online shopping experiences.

Design Shaping Shopping Habits: The Problem With Current E-commerce Design

Today, most online shopping experiences are designed with persuasion, gamification, nudging and even deception to get unsuspecting users to add more things to their basket.

There are “Hurry, only one item left in stock” type messages and countdown clocks that exploit well-known cognitive biases to nudge users to make impulse purchase decisions. As Michael Keenan explains,

“The scarcity bias says that humans place a higher value on items they believe to be rare and a lower value on things that seem abundant. Scarcity marketing harnesses this bias to make brands more desirable and increase product sales. Online stores use limited releases, flash sales, and countdown timers to induce FOMO — the fear of missing out — among shoppers.”

— Michael Keenan

To make buying things quick and effortless, we remove friction from the checkout process, for example, with the one-click-buy button. As practitioners of user-centered design, we might implement the button and say: thanks to this frictionless and easy checkout process, we improved the customer experience. Or did we just do a huge disservice to our users?

Gliding through the checkout process in seconds leaves no time for the user to ask, “Do I actually want this?” or “Do I have the money for this?”. Indeed, putting users on autopilot to make thoughtless decisions is the goal.

As a business.com article says: “Click to buy helps customers complete shopping within seconds and reduces the amount of time they have to reconsider their purchase.”

Amanda Mull writes from a user perspective about how it has become “too easy to buy stuff you don’t want”:

“The order took maybe 15 seconds. I selected my size and put the shoes in my cart, and my phone automatically filled in my login credentials and added my new credit card number. You can always return them, I thought to myself as I tapped the “Buy” button. [...] I had completed some version of the online checkout process a million times before, but I never could remember it being quite so spontaneous and thoughtless. If it’s going to be that easy all the time, I thought to myself, I’m cooked.”

— Amanda Mull

This quote also highlights that this thoughtless consumption is not only harmful to the environment but also to the very same user we say we center our design process around. The rising popularity of buy-now-pay-later services, credit card debt, and personal finance gurus to help “Overcoming Overspending” are indicators that people are spending more than they can afford, a huge source of stress for many.

The one-click-buy button is not about improving user experience but building an environment where users are “more likely to buy more and buy often.” If we care to put this bluntly, frictionless and persuasive e-commerce design is not user-centered but business-centered design.

While it is not unusual for design to be a tool to achieve business goals, we, designers, should be clear about who we are serving and at what cost with the power of design. To reckon with our impact, first, we have to understand the source of power we yield — the power asymmetry between the designer and the user.

Power Asymmetry Between User And Designer

Imagine a scale: on one end sits the designer and the user on the other. Now, let’s take an inventory of the sources of power each party has in their hands in an online shopping situation and see how the scale balances.

Designers

Designers are equipped with knowledge about psychology, biases, nudging, and persuasion techniques. If we don’t have the time to learn all that, we can reach for an out-of-the-box solution that uses those exact psychological and behavioral insights. For example, Nudgify, a Woocommerce integration, promises to help “you get more sales and reduce shopping cart abandonment by creating Urgency and removing Friction.”

Erika Hall puts it this way: “When you are designing, you are making choices on behalf of other people.” We even have a word for this: choice architecture. Choice architecture refers to the deliberate crafting of decision-making environments. By subtly shaping how options are presented, choice architecture influences individuals’ decisions, often without their explicit awareness.

On top of this, we also collect funnel metrics, behavioral data, and A/B test things to make sure our designs work as intended. In other words, we control the environment where the user is going to make decisions, and we are knowledgeable about how to tweak it in a way to encourage the decisions we want the user to make. Or, as Vitaly Friedman says in one of his articles:

“We’ve learned how to craft truly beautiful interfaces and well-orchestrated interactions. And we’ve also learned how to encourage action to meet the project’s requirements and drive business metrics. In fact, we can make pretty much anything work, really.”

— Vitaly Friedman

User

On the other end of the scale, we have the user, who is usually unaware of our persuasion efforts and oblivious to their own biases, let alone aware of when and how those biases are triggered.

Luckily, regulation around Deceptive Design in e-commerce is increasing. For example, companies are not allowed to use fake countdown timers. However, these regulations are not universal, and enforcement is lax, so users are often still not protected by law against pushy shopping experiences.

After this overview, let’s see how the scale balances:

When we understand this power asymmetry between designer and user, we need to ask ourselves:

  • What do I use my power for?
  • What kind of “real life” user behavior am I designing for?
  • What is the impact of the users’ behavior resulting from my design?

If we look at e-commerce design today, more often than not, the unfortunate answer is mindless and excessive consumption.

This needs to change. We need to use the power of design to encourage sustainable user behavior and thus move us toward a sustainable future.

What Is Sustainable E-commerce?

The discussion about sustainable e-commerce usually revolves around recyclable packaging, green delivery, and making the site energy-efficient with sustainable UX. All these actions and angles are important and should be part of our design process, but can we build a truly sustainable e-commerce if we are still encouraging unsustainable user behavior by design?

To achieve truly sustainable e-commerce, designers must shift from encouraging impulse purchases to supporting thoughtful decisions. Instead of using persuasion, gamification, and deception to boost sales, we should use our design skills to provide users with the time, space, and information they need to make mindful purchase decisions. I call this approach Kind Commerce.

But The Business?!

While the intent of designing Kind Commerce is noble, we have a bitter reality to deal with: we live and work in an economic system based on perpetual growth. We are often measured on achieving KPIs like “increased conversion” or “reduced cart abandonment rate”. We are expected to use UX to achieve aggressive sales goals, and often, we are not in a position to change that.

It is a frustrating situation to be in because we can argue that the system needs to change, so it is possible for UXers to move away from persuasive e-commerce design. However, system change won’t happen unless we push for it. A catch-22 situation. So, what are the things we could do today?

  • Pitch Kind Commerce as a way to build strong customer relationships that will have higher lifetime value than the quick buck we would make with persuasive tricks.
  • Highlight reduced costs. As Vitaly writes, using deceptive design can be costly for the company:

“‘Add to basket’ is beautifully highlighted in green, indicating a way forward, with insurance added in automatically. That’s a clear dark pattern, of course. The design, however, is likely to drive business KPIs, i.e., increase a spend per customer. But it will also generate a wrong purchase. The implications of it for businesses might be severe and irreversible — with plenty of complaints, customer support inquiries, and high costs of processing returns.”

— Vitaly Friedman

Helping users find the right products and make decisions they won’t regret can help the company save all the resources they would need to spend on dealing with complaints and returns. On top of this, the company can save millions of dollars by avoiding lawsuits for unfair commercial practices.

  • Highlight the increasing customer demand for sustainable companies.
  • If you feel that your company is not open to change practices and you are frustrated about the dissonance between your day job and values, consider looking for a position where you can support a company or a cause that aligns with your values.

A Few Principles To Design Mindful E-commerce

Add Friction

I know, I know, it sounds like an insane proposition in a profession obsessed with eliminating friction, but hear me out. Instead of “helping” users glide through the checkout process with one-click buy buttons, adding a step to review their order gives them pause and could help reduce unnecessary purchases. A positive reframing for this technique could be helpful to express our true intentions.

Instead of saying “adding friction,” we could say “adding a protective step”. Another example of “adding a protective step” could be getting rid of the “Quick Add” buttons and making users go to the product page to take a look at what they are going to buy. For example, Organic Basics doesn’t have a “Quick Add” button; users can only add things to their cart from the product page.
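
As a loose illustration of what such a “protective step” might look like in code (the element ids, copy, and assumption that the review panel starts out hidden are all hypothetical, not from any real store):

// A minimal sketch: the first click reveals an order summary instead of buying;
// only a second, deliberate click lets the purchase proceed.
const buyButton = document.querySelector("#buy-now");
const review = document.querySelector("#order-review"); // assumed to start with the hidden attribute

buyButton.addEventListener("click", (event) => {
  if (!review.hidden) return; // second click, after reviewing: let the purchase handler proceed
  event.preventDefault();
  review.hidden = false; // first click: show items, total, and delivery details instead of buying
  buyButton.textContent = "Confirm order";
});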

Inform

Once we make sure users will visit product pages, we can help them make more informed decisions. We can be transparent about the social and environmental impact of an item or provide guidelines on how to care for the product to last a long time.

For example, Asket has a section called “Lifecycle” where they highlight how to care for, repair and recycle their products. There is also a “Full Transparency” section to inform about the cost and impact of the garment.

Design Calm Pages

Aggressive landing pages, where everything is moving or blinking, modals keep popping up, and ten different discounts are presented at once, are overwhelming, confusing, and distracting: a fertile environment for impulse decisions.

Respect your user’s attention by designing pages that don’t raise their blood pressure to 180 the second they open them. No modals automatically popping up, no flashing carousels, and no discount dumping. Aim for static banners and display offers in a clear and transparent way. For example, H&M shows only one banner highlighting a discount on their landing page, and that’s it. If a fast fashion brand like H&M can design calm pages, there is no excuse why others couldn’t.

Be Honest In Your Messaging

Fake urgency and social proof can not only get you fined for millions of dollars but also turn users away. So simply do not add urgency messages and countdown clocks where there is no real deadline behind an offer. Don’t use fake social proof messages. Don’t say something has a limited supply when it doesn’t.

I would even take this a step further and recommend using persuasion sparingly, even when it is honest. Instead of overloading the product page with every possible persuasion method (urgency, social proof, incentive, assuming they are all honest), choose a single, impactful persuasion point.

Disclaimer

To make it clear, I’m not advocating for designing bad or cumbersome user experiences to obstruct customers from buying things. Of course, I want a delightful and easy way to buy things we need.

I’m also well aware that design is never neutral. We need to present options and arrange user flows, and whichever way we choose to do that will influence user decisions and actions.

What I’m advocating for is at least putting the user back in the center of our design process. We read earlier that users feel it has become “too easy to buy stuff you don’t want” and that the current state of e-commerce design is contributing to their excessive spending. Understanding this and calling ourselves user-centered, we ought to change our approach significantly.

On top of this, I’m advocating for expanding our perspective to consider the wider environmental and social impact of our designs and align our work with the move toward a sustainable future.

Mindful Consumption Beyond E-commerce Design

E-commerce design is a practical example of how design is a part of encouraging excessive, unnecessary consumption today. In this article, we looked at what we can do on this practical level to help our users shop more mindfully. However, transforming online shopping experiences is only a part of a bigger mission: moving away from a culture where excessive consumption is the aspiration for customers and the ultimate goal of companies.

As Cliff Kuang says in his article,

“The designers of the coming era need to think of themselves as inventing a new way of living that doesn’t privilege consumption as the only expression of cultural value. At the very least, we need to start framing consumption differently.”

— Cliff Kuang

Or, as Manuel Lima puts it in his book, The New Designer,

“We need the design to refocus its attention where it is needed — not in creating things that harm the environment for hundreds of years or in selling things we don’t need in a continuous push down the sales funnel but, instead, in helping people and the planet solve real problems. [...] Design’s ultimate project is to reimagine how we produce, deliver, consume products, physical or digital, to rethink the existing business models.”

— Manuel Lima

So buckle up, designers, we have work to do!

To Sum It Up

Today, design is part of the problem of encouraging and facilitating excessive consumption through persuasive e-commerce design and through designing for companies with linear and exploitative business models. For a liveable future, we need to change this. On a tactical level, we need to start advocating and designing mindful shopping experiences, and on a strategic level, we need to use our knowledge and skills to elevate sustainable businesses.

I’m not saying that it is going to be an easy or quick transition, but the best time to start is now. In a dire state of need for sustainable transformation, designers with power and agency can’t stay silent or continue perpetuating the problem.

“As designers, we need to see ourselves as gatekeepers of what we are bringing into the world and what we choose not to bring into the world. Design is a craft with responsibility. The responsibility to help create a better world for all.”

— Mike Monteiro

This Is How SSL Certificates Work: HTTPS Explained in 15 Minutes

The world of online security may seem complex, but understanding the basics of how SSL certificates work and why HTTPS is essential can empower you to make safer choices online. Just like Jane, you can navigate the digital landscape with confidence, knowing that your data is protected from prying eyes. So the next time you browse the web, remember the story of Jane and the coffee shop hacker, and choose secure, trusted websites for your online activities. Let’s start with Jane, who was enjoying her coffee peacefully.

Chapter 1: The Coffee Shop Conundrum

It was a sunny afternoon, and Jane decided to take a break from her hectic day. She headed to her favorite coffee shop, ordered a latte, and found a cozy corner to catch up on some online shopping and emails. As she settled in, she connected her laptop to the coffee shop’s free Wi-Fi and began browsing. Little did she know, a hacker named Bob was sitting just a few tables away, eager to intercept her data.

Chris’ Corner: Incremental Adoption

One of the reasons I can’t stop thinking about native Web Components is how you can use them anywhere. “Incremental adoption” is the fancy phrase, I suppose.

We’ve even started using them on the new editor for CodePen we’re still hard at work on, to solve some interesting issues I’m sure we’ll talk about someday. We’re using React/Next there, which isn’t famous for its support of Web Components, but it’s mostly been fine. Other JavaScript frameworks are much more friendly, and of course, if you aren’t using a JavaScript framework, you haven’t a care in the world; Web Components can be part of your world.

Because they are pretty easy to slip in anywhere, they work pretty well on CodePen. I mentioned a list of “stand alone” Web Components the other week here at the ol’ Corner, then converted them into Pens just to prove that. A lot of web components are published to npm, so using a site like esm.sh makes linking up the resources pretty easy.

What’s cool about using web components like I have above is that they just might last forever. I ranted a little about this on Mastodon the other day. The fewer dependencies a web component has, the longer it will last. It will certainly outlive your JavaScript framework, as Jake Lazaroff put it:

If we want our work to be accessible in five or ten or even 20 years, we need to use the web with no layers in between. For all its warts, the web has become the most resilient, portable, future-proof computing platform we’ve ever created — at least, if we build with that in mind.

I think that’s cool.

It makes me think about what will break in those demos I posted. Like, what is going to make this Pen stop working someday? If CodePen goes offline, it will. But we’ve just had our 12th birthday and are going strong. You’d have to fight me to the death for that to happen. The web component is linked up from esm.sh, so if that went down, it would stop working. That’s definitely possible; we’ve seen free CDN-like websites like this come and go. But you could just change to a different one. The code is on npm, so that could die or the author could pull it down. But there doesn’t seem to be a lot of risk of that, and it’s open source, so mirrors will exist. Pretty resilient, I’d say! Although different projects have different needs there, and you could always get stronger by reducing even those dependencies.

Oh hey, speaking of web components and things that are super cool… check out David Darnes’ new one just for us: <code-pen>.

The idea is that it’s a convenient way to use our Post to Prefill API. So you’d author code like this:

<script type="module" src="code-pen.js"></script>

<code-pen>
  <pre>
    <code>&lt;p&gt;Hello world&lt;/p&gt;</code>
  </pre>
  <pre>
    <code>:root { color: hotpink; }</code>
  </pre>
  <pre>
    <code>document.querySelector(&quot;p&quot;).style.backgroundColor = &quot;orange&quot;;</code>
  </pre>
</code-pen>

(Or you could use Markdown triple backticks and avoid the code escaping, among other options)

Then it builds an “Open in CodePen” button that users can click to open that code in a real Pen. The point is usually stuff like documentation and blog posts where you want to manage all the code yourself, not maintain both the text and the Pens separately. Thanks David, this is super cool.

David also recently published a free eBook called The Case for Web Components you might want to check out if you’re looking to be further convinced or need to learn more of the basics before you decide. In it, he mentioned the use case of “design systems”, which seems like an awful big one to me. These days, if you’re creating a design system, doing it in anything other than web components seems weird. Web components will last and be movable between frameworks as your project evolves, with little if any downside.

While we’re on the subject let’s just get a little more into the weeds.

Ultimately, a Web Component is a class in JavaScript. It has standard methods, like a constructor that is called when the class is instantiated. It also has a method, connectedCallback, that runs on each instance of a web component when it shows up in the DOM. This is a bit of a subtle difference and can be quite a gotcha, as Nolan Lawson says. If your component needs to do stuff that might be unique to each instance, that belongs in connectedCallback. I think it’s a real superpower of web components! For example, it’s kinda like getting event delegation for free.
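
Here is a tiny, made-up element to show the split (the tag name and behavior are just for illustration, not from any of the libraries mentioned): one-time state setup lives in the constructor, while anything that depends on the element actually being in the DOM goes in connectedCallback, per instance.

class HelloCard extends HTMLElement {
  constructor() {
    super();
    // Runs once per instance when it is created; the element is not in the DOM yet.
    this.clicks = 0;
  }

  connectedCallback() {
    // Runs on each instance every time it is attached to the DOM,
    // so per-instance rendering and wiring belongs here.
    this.textContent = `Hello from ${this.getAttribute("name") || "a web component"}`;
    this.onclick = () => console.log(`clicked ${++this.clicks} times`);
  }
}

customElements.define("hello-card", HelloCard);
// Usage: <hello-card name="CodePen"></hello-card>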

There is an approach to web components called HTML web components which is essentially:

  1. Take some perfectly acceptable HTML
  2. Wrap it in a <web-component> to extend the functionality

That’s fun and obviously useful (e.g., a blog post with a web component would render fine over RSS). It’s related to, but not quite the same as, declarative shadow DOM. Raymond Camden explores something interesting here… how does a web component know if the HTML inside it changes? Unfortunately, there is no obvious solution or helper API for this… other than the platform itself. The trick is to use a MutationObserver on the element itself, internally, to watch for changes and then do whatever you gotta do. Interesting stuff. I’d be tempted to rip the whole thing out of the DOM and replace it just so connectedCallback does its thing, rather than craft every web component such that it’s watching for its own changes.
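
A rough sketch of that trick, with a made-up element that simply re-counts its light-DOM children whenever the wrapped HTML changes (the tag name and the data attribute are hypothetical):

class LiveSummary extends HTMLElement {
  connectedCallback() {
    this.render();
    // Watch this element's own light-DOM children and re-run the enhancement on any change.
    this.observer = new MutationObserver(() => this.render());
    this.observer.observe(this, { childList: true, subtree: true, characterData: true });
  }

  disconnectedCallback() {
    // Stop observing when the element leaves the DOM.
    this.observer.disconnect();
  }

  render() {
    // Hypothetical "enhancement": count the list items currently inside the element.
    const count = this.querySelectorAll("li").length;
    this.setAttribute("data-item-count", String(count));
  }
}

customElements.define("live-summary", LiveSummary);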

I hadn’t really thought that much about watching for changes like that before, as it just hasn’t come up for me. Similarly, Ben Nadel points out that the contents of a <template> can be mutated at any time, and new components instantiated off it will use that new content. That makes sense to me; it’s just a twist on how I normally think of components. I think of the template as this static thing which takes data and does what it needs to do. Less often do I think of the template itself as something dynamic that changes based on data.
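
A quick sketch of that behavior, assuming a hypothetical <template id="card-template"> with an <h2> inside it already sits in the page:

const tpl = document.querySelector("#card-template");

// The first instance clones the template content as originally authored.
document.body.append(tpl.content.cloneNode(true));

// Mutate the template itself at runtime...
tpl.content.querySelector("h2").textContent = "Updated heading";

// ...and anything instantiated afterwards clones the new content.
document.body.append(tpl.content.cloneNode(true));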

Useful Customer Journey Maps (+ Figma & Miro Templates)

User journey maps are a remarkably effective way to visualize the user’s experience for the entire team. Instead of pointing to documents scattered across remote fringes of Sharepoint, we bring key insights together — in one single place.

Let’s explore a couple of helpful customer journey templates to get started and how companies use them in practice.

This article is part of our ongoing series on UX. You might want to take a look at Smart Interface Design Patterns 🍣 and the upcoming live UX training as well. Use code BIRDIE to save 15% off.

AirBnB Customer Journey Blueprint

AirBnB Customer Journey Blueprint (also check the Google Drive example) is a wonderful practical example of how to visualize the entire customer experience for two personas, across eight touchpoints, with user policies, UI screens, and all interactions with customer service — all on one single page.

Now, unlike AirBnB, your product might not need a mapping against user policies. However, it might need other lanes that would be more relevant for your team. For example, include relevant findings and recommendations from UX research. List key actions needed for the next stage. Include relevant UX metrics and unsuccessful touchpoints.

Whatever works for you, works for you — just make sure to avoid assumptions and refer to facts and insights from research.

Spotify Customer Journey Map

Spotify Customer Journey Blueprint (high resolution) breaks down customer experiences by distinct user profiles, and for each includes mobile and desktop views, pain points, thoughts, and actions. Also, notice branches for customers who skip authentication or admin tasks.

Getting Started With Journey Maps

To get started with user journey maps, we first choose a lens: Are we reflecting the current state or projecting a future state? Then, we choose a customer who experiences the journey — and we capture the situation/goals that they are focusing on.

Next, we list high-level actions users are going through. We start by defining the first and last stages and fill in between. Don’t get too granular: list key actions needed for the next stage. Add the user’s thoughts, feelings, sentiments, and emotional curves.

Eventually, add the user’s key touchpoints with people, services, and tools. Map the user journey across mobile and desktop screens. Transfer insights from other research (e.g., customer support). Fill in stage after stage until the entire map is complete.

Then, identify pain points and highlight them with red dots. Add relevant jobs-to-be-done, metrics, channels if needed. Attach links to quotes, photos, videos, prototypes, Figma files. Finally, explore ideas and opportunities to address pain points.

Free Customer Journey Maps Templates (Miro, Figma)

You don’t have to reinvent the wheel from scratch. Below, you will find a few useful starter kits to get up and running fast. However, please make sure to customize these templates for your needs, as every product will require its own specific details, dependencies, and decisions.

Wrapping Up

Keep in mind that customer journeys are often non-linear, with unpredictable entry points and integrations way beyond the final stage of a customer journey map. It’s in those moments when things leave a perfect path that a product’s UX is actually stress-tested.

So consider mapping unsuccessful touchpoints as well — failures, error messages, conflicts, incompatibilities, warnings, connectivity issues, eventual lock-outs and frequent log-outs, authentication issues, outages, and urgent support inquiries.

Also, make sure to question assumptions and biases early. Once they live in your UX map, they grow roots — and it might not take long until they are seen as the foundation of everything, which can be remarkably difficult to challenge or question later. Good luck, everyone!

Meet Smart Interface Design Patterns

If you are interested in UX and design patterns, take a look at Smart Interface Design Patterns, our 10h-video course with 100s of practical examples from real-life projects — with a live UX training later this year. Everything from mega-dropdowns to complex enterprise tables — with 5 new segments added every year. Jump to a free preview. Use code BIRDIE to save 15% off.

Meet Smart Interface Design Patterns, our video course on interface design & UX.

100 design patterns & real-life examples.
10h-video course + live UX training. Free preview.

Best Free Shopify Templates That Will Elevate Your E-Commerce Store

Kickstarting an e-commerce store can feel like trying to build a castle out of LEGO blocks without a manual. Whether you’re a seasoned entrepreneur or a novice, the journey is riddled with choices, among which selecting the right theme tops the list. Fear not, the world of Shopify is a goldmine brimming with dazzling yet completely free themes. They can make your store look like a million bucks, even if you’re still waiting for your first sale.

Think of free Shopify themes as your store’s virtual wardrobe – you wouldn’t dress your shop in pajamas, right? With countless options at your fingertips, it’s as if you have an online stylist at your service. These templates bring style and functionality without the hefty price tag. Plus, they’re an absolute cinch to install.

Ready to dive into the slick, stylish, and snazzy world of Shopify themes? Buckle up, because we’ve got the lowdown on the best free templates to make your site shine brighter than a disco ball at Studio 54.

Why do you need a custom Shopify theme?

Imagine walking into a grocery store where aisles twist like a labyrinth and bananas are shelved next to light bulbs. Chaos, right? That’s what a default Shopify theme can feel like. A custom theme transforms your store from that chaotic mess into a seamless shopping experience – think IKEA without the frustration.

It’s not just about looks, though you’ll definitely turn heads with a snazzy setup. A tailored theme aligns with your brand’s vibe, making customers feel right at home, like they’re visiting a favorite café. Plus, you get to flaunt unique features that can boost sales quicker than a cat video goes viral.

In the crowded market, standing out isn’t optional; it’s survival. So, treat your store like that Instagram-perfect coffee shop. The right theme sets the stage for delightful discoveries and ensures your customers keep coming back for more lattes – and purchases.

The Significance of a Quality Shopify Theme

Imagine your Shopify store as a superhero; its theme is the cape. A quality theme doesn’t just swoosh dramatically—it gets things done. Flashy looks? Absolutely. But more importantly, it makes shopping a breeze. Think of it as offering an all-access pass to your products without the annoying velvet ropes.

Why settle for a default when you can go deluxe? A high-caliber theme means faster load times. Shoppers are like goldfish—easily distracted. Keep ’em hooked with a snappy interface.

Responsiveness? That’s non-negotiable. Your theme needs to look sharp on any device, just like you in your LinkedIn profile pic. Don’t forget SEO-friendly designs. Your store should be a magnet for search engines, pulling in customers like free samples at Costco.

In short, a quality Shopify theme is your behind-the-scenes co-star, ensuring every visitor has an unforgettable, purchase-inducing experience.💰

How to pick the best Shopify theme for your store?

Ready to supercharge your Shopify store with an epic theme but overwhelmed by the sea of options? No sweat, you’re not alone. Start with your vibe. Are you selling high-end fashion or quirky handmade crafts? Your theme should scream your brand’s personality louder than a teenager at a concert.

Next, check the functionality. Test drive those demos like you’re at a car show. Look for features you need—sliders, product zooms, Instagram feeds—you name it.

Don’t forget mobile. Seriously, most people shop on their phones while doing everything else. Make sure your theme looks sleek on any screen size.

Reviews are your friends. If Sally from Iowa found it helpful, you might too. But take ’em with a pinch of salt; some folks are never satisfied.

Lastly, support. Free stuff is great until it breaks. Ensure there’s a helpful community or some support for those inevitable hiccups.

Follow these pointers and you’ll pick a theme that’s as fabulous as your products. 🕶️

Why free Shopify themes are a good option?

Sure, who doesn’t love free stuff? Free Shopify themes are like finding a $20 bill in your old jeans. They help you save some serious cash while offering a decent range of customization. Money saved on themes means more budget for killer marketing campaigns, right?

Plus, many free themes are backed by Shopify’s solid dev team. You won’t get stuck with some buggy afterthought of a theme. These templates are user-focused and often just as snazzy as their paid counterparts. They can give your store a polished look without squeezing your wallet dry.

If you’re just dipping your toes into the e-commerce ocean, free themes are a no-brainer. They let you experiment without financial commitment. And hey, if your store blows up overnight, you can always upgrade later. It’s all about working smart, not hard!

Tips for Choosing the Right Template   

Choosing the right Shopify theme isn’t rocket science, but a little insider wisdom never hurt anyone, right? First, think about your brand style. If your store is all about Gothic fashion, a pastel floral theme won’t do the trick.

Keep an eye on functionality — it’s like checking the engine before buying a car. Does it support all your desired features? Integrated social media? A snazzy product slider? Check the reviews like you’re stalking an ex’s Facebook. See what other users say about it.

Make sure it looks good on mobile. Folks are more likely to shop while waiting in line for their coffee than sitting at a desktop. Test drive different themes with your actual content. Sometimes what looks like gold turns out to be glitter.

And finally, don’t stress too much. It’s okay to start simple and upgrade later. After all, even Rome wasn’t built on a premium template!

Conclusion

Ready to wrap this up? Free Shopify templates can be lifesavers for new store owners with a tight budget. Don’t worry; you won’t sacrifice style or functionality. Imagine these themes as your e-commerce Swiss Army knife: they’ve got it all, and they don’t cost a dime.

Remember, your site’s look is its first impression. Make it count! Test different free themes like you’re trying on outfits before a big date. Do they fit your brand? Are they responsive? If your theme doesn’t make your products shine, swipe left.

Stay flexible. If your first choice doesn’t attract customers, try another. After all, even astronauts do test runs!

So go ahead, dive into the pool of the best free Shopify templates. You might find the perfect one faster than you can say, “Shopify theme for free!” Your future self—and your sales—will thank you.

  • Dawn
  • Refresh
  • Sense
  • Origin
  • Ride
  • Craft
  • Spotlight
  • Taste
  • Colorblock
  • Publisher
  • Crave
  • Studio
  • Trade
  • Fashe
  • Thalia
  • Vendy Shopping Store Theme
  • Voonex
