Semantic Search With Weaviate Vector Database

A previous post discussed how the document format and the way a document is embedded influence the results of a semantic search. LangChain4j was used for this. One of the main conclusions was that the way the document is embedded has a major influence on the results. However, a perfect result was not achieved.

In this post, you will take a look at Weaviate, a vector database that has a Java client library available. You will investigate whether better results can be achieved.

How To Embed Documents for Semantic Search

In this post, you will take a closer look at embedding documents to be used for a semantic search. By means of examples, you will learn how embedding influences the search result and how you can improve the results. Enjoy!

Introduction

In a previous post, a chat with documents using LangChain4j and LocalAI was discussed. One of the conclusions was that the document format has a large influence on the results. In this post, you will take a closer look at the influence of source data and the way it is embedded in order to get a better search result.
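
To make the role of embeddings a bit more concrete: a semantic search typically compares the embedding vector of the question with the embedding vectors of the document chunks and returns the closest ones. The sketch below shows one common way to measure that closeness, cosine similarity. The three-dimensional vectors and all names are made up for illustration; real embeddings have hundreds of dimensions and come from an embedding model, and this is not necessarily the exact measure used in the post.

```java
public class CosineSimilaritySketch {

    // Cosine similarity between two embedding vectors of equal length.
    // Values close to 1 mean "pointing in the same direction", values close to 0 mean "unrelated".
    // Note: an all-zero vector yields NaN, which this sketch does not handle.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Hypothetical, hand-written embeddings (assumption: not produced by a real model).
        double[] question = {0.9, 0.1, 0.0};
        double[] relevantChunk = {0.8, 0.2, 0.1};
        double[] unrelatedChunk = {0.0, 0.1, 0.9};

        System.out.println("relevant chunk:  " + cosine(question, relevantChunk));
        System.out.println("unrelated chunk: " + cosine(question, unrelatedChunk));
    }
}
```

The chunk with the higher score would be returned first, which is why the way the source data is split and embedded has such an effect on the search result.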

Running LLMs Locally: A Step-by-Step Guide

In this post, you will take a closer look at LocalAI, an open-source alternative to OpenAI that allows you to run LLMs on your local machine. No GPU is needed: consumer-grade hardware will suffice. Enjoy!

Introduction

OpenAI is a great tool. However, company policy may not allow you to use it, because you might send sensitive information to OpenAI. Besides that, you might want to experiment with different kinds of LLMs (Large Language Models). Wouldn’t it be great if you could run models locally using the same REST API as OpenAI? Well, that is exactly what LocalAI has to offer you! LocalAI is an open-source alternative to OpenAI and has a REST API that is compatible with the OpenAI API specification. No GPU is needed: you can run it on consumer-grade hardware. Using a GPU is advised, however, because it will be approximately 20 times faster.
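
Because the API is OpenAI-compatible, a chat request to a local instance can look the same as one to OpenAI, only with a different base URL. The following is a minimal sketch using the JDK's built-in HTTP client (Java 11 or later). The base URL http://localhost:8080 and the model name gpt-3.5-turbo are assumptions for illustration; use the address and model name of your own LocalAI setup.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class LocalAiChatSketch {

    // Builds a minimal chat completion body in the OpenAI request format.
    // Only backslashes and double quotes are escaped; a JSON library is the better choice in real code.
    static String chatBody(String model, String prompt) {
        String escaped = prompt.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"model\":\"" + model + "\",\"messages\":[{\"role\":\"user\",\"content\":\""
                + escaped + "\"}]}";
    }

    // Creates a POST request for the chat completions endpoint of an OpenAI-compatible server.
    static HttpRequest chatRequest(String baseUrl, String model, String prompt) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/v1/chat/completions"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(chatBody(model, prompt)))
                .build();
    }

    public static void main(String[] args) {
        // Assumptions: base URL and model name are placeholders for a local setup.
        HttpRequest request = chatRequest("http://localhost:8080", "gpt-3.5-turbo", "Say hello");
        try {
            HttpResponse<String> response =
                    HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body());
        } catch (IOException e) {
            System.out.println("Could not reach the server: " + e.getMessage());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```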

How To Use Ansible Roles

In this article, you will learn the basics of Ansible Roles. With Ansible Roles, you can reuse the Ansible content you create and share it with other users. You will learn about Ansible Roles step-by-step by means of examples. Enjoy!

Introduction

In the three previous Ansible posts, you learned how to set up an Ansible test environment, how to create an Ansible inventory, and how to create an Ansible playbook.

Devoxx Belgium 2022 Takeaways

In October 2022, I visited Devoxx Belgium after two cancelled editions due to COVID-19. I learned a lot and picked up quite a bit of information that I do not want to withhold from you. In this blog, you can find my takeaways from Devoxx Belgium 2022!

1. Introduction

Devoxx Belgium is the largest Java conference in Europe. This year, it was already the 19th edition. As always, Devoxx was held in the fantastic theatres of Kinepolis Antwerp. The past two editions were cancelled due to COVID-19. As a result, there was a rush on the tickets. The first batch of tickets sold out in 5 minutes, the second batch in a few seconds. Reactions on Twitter mentioned that it looked like a ticket sale for Beyoncé.

Main Benefits of a Technical Blog

This blog is a special edition because it is my 100th blog! I will explain what this blog has given me in the past five years. If you are planning to start a blog of your own, you may use this list of benefits to motivate yourself to get started.

1. Introduction

At the beginning of September, I celebrated the fifth anniversary of my blog. Now I am publishing my 100th blog and I am pretty proud of it. It does not seem so long ago that I started my blog, but on the other hand, it also feels like I have been doing this for a long time. At least, I cannot imagine a life without my blog anymore. In the beginning, I really suffered from imposter syndrome: I posted blogs, but did not let anyone in my direct environment know that I had a blog. After a few months, I left this feeling behind and let the world know that I write technical content. In those five years, I received only one or two negative comments but many positive ones, and I really do not let the negative ones bother me. In the next section, I will try to list some of the benefits of a technical blog. If you would like to start a blog yourself, do read Why Start a Technical Blog. Enjoy this post, and here's to the next five years!

How to Setup an Ansible Test Environment

When you want to experiment with Ansible, you will need to set up a test environment. In this blog, you will create a test environment containing one controller and two target machines. This will give you a good setup for experimenting with Ansible without breaking a real machine.

1. Introduction

With Ansible, you can automate repetitive IT tasks. Because the tasks are automated, Ansible also prevents you from making mistakes, especially when you have to configure several similar environments. The other main advantage is that the configuration is maintained in files and is therefore well suited to version control (e.g. Git). However, in every learning path you need to be able to experiment in order to make mistakes and to learn. In this blog, you will set up an Ansible controller machine and two target machines running in VirtualBox. The Ansible controller is the machine from which you run the Ansible playbooks, and the target machines are where the tasks are executed. The test setup looks as follows.

How to Generate Fake Test Data

Are you also often uninspired when you need to think of useful test data for your unit tests? Is ‘John Doe’ your best test friend? Do not worry, Java Faker comes to the rescue! In this blog, you will learn how to generate your test data.

1. Introduction

Making up test data is one of the hardest tasks when writing tests. Often you will see 123 when numbers are being used, or John Doe when a name is needed. But this also means that the test will always run with the same data. On the one hand, this is a good thing because your tests need to be stable. On the other hand, it is a pity because you also want to find errors, and that is more likely when random test data is being used.
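
The trade-off between stable and random data can be illustrated with a small sketch that uses only java.util.Random from the JDK. This is not Java Faker itself; the class, the method names, and the name lists are hypothetical. The idea is that a seeded generator produces varied data, while logging the seed makes a failing run reproducible.

```java
import java.util.Random;

public class RandomTestDataSketch {

    // Small, made-up name lists; a library such as Java Faker offers far richer data.
    private static final String[] FIRST_NAMES = {"Alice", "Bram", "Chen", "Dana"};
    private static final String[] LAST_NAMES = {"Jansen", "Okafor", "Silva", "Yamada"};

    private final Random random;

    // The same seed always produces the same sequence of test data.
    RandomTestDataSketch(long seed) {
        this.random = new Random(seed);
    }

    String fullName() {
        return FIRST_NAMES[random.nextInt(FIRST_NAMES.length)]
                + " " + LAST_NAMES[random.nextInt(LAST_NAMES.length)];
    }

    // Returns a number in the range [minInclusive, maxExclusive).
    int numberBetween(int minInclusive, int maxExclusive) {
        return minInclusive + random.nextInt(maxExclusive - minInclusive);
    }

    public static void main(String[] args) {
        // Use a fresh seed per run, but print it so a failing test can be replayed.
        long seed = System.nanoTime();
        System.out.println("seed = " + seed);

        RandomTestDataSketch data = new RandomTestDataSketch(seed);
        System.out.println(data.fullName() + ", age " + data.numberBetween(18, 100));
    }
}
```

When a test fails, rerunning it with the printed seed recreates exactly the same data, which keeps the benefit of random input without giving up reproducibility.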