Neural Network Archives – Nexis
https://www.nexosis.com/category/neural-network/

ChatGPT – an assistant for a programmer? An example of a real-world task: Neural network square recognition
https://www.nexosis.com/chatgpt-an-assistant-for-a-programmer-an-example-of-a-real-world-task-neural-network-square-recognition/
Wed, 17 May 2023

However you look at it, the ChatGPT language model can never completely replace a programmer, because only about one-tenth of total development time is spent writing code. ChatGPT is, however, great at helping with many aspects of programming, and the more skill and experience a programmer has, the more useful such an "assistant" becomes:

  • Perform code optimization and improve performance.
  • Find and fix bugs in the code.
  • Explain complex concepts and algorithms.
  • Assist in developing ideas and choosing the right architecture.
  • Create prototypes and demos of programs.
  • Give advice on programming style and best practices.
  • Automate repetitive tasks.
  • Generate code based on specifications or specified parameters.
  • Extend functionality with plugins and tools.
  • Write documentation and code comments.
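As a concrete illustration of the code-generation point above, here is the sort of function an assistant might produce from a one-line specification such as "detect whether four points form a square" – echoing the square-recognition task in the title. The function name and the specification are my own illustration, not taken from the original article:

```python
from itertools import combinations

def is_square(points):
    """Return True if the four 2-D points form a square (in any order)."""
    if len(points) != 4:
        return False
    # Squared distances between all 6 pairs of points.
    d = sorted((ax - bx) ** 2 + (ay - by) ** 2
               for (ax, ay), (bx, by) in combinations(points, 2))
    # A square has 4 equal sides plus 2 equal diagonals whose squared
    # length is exactly twice the squared side length.
    return d[0] > 0 and d[0] == d[3] and d[4] == d[5] and d[4] == 2 * d[0]
```

Sorting the squared distances avoids caring about point order, and comparing squared lengths sidesteps floating-point square roots entirely for integer inputs.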

Today it would be simply foolish not to use ChatGPT's features. It really is a universal assistant that greatly simplifies a programmer's life and increases development efficiency. With it, programming becomes a far more pleasant and productive activity than ever before.

How a neural network recognized landmarks on photo cards
https://www.nexosis.com/how-a-neural-network-recognized-landmarks-on-photo-cards/
Tue, 16 May 2023

The goal of the project was to recognize landmarks in photographs using machine learning, specifically convolutional neural networks. This topic was chosen for the following reasons:

  • The author already had some experience with computer vision tasks.
  • The task sounded as if it could be done quickly, without much effort and, importantly, without a lot of computing resources (all networks were trained in Colab or on Kaggle).
  • The problem could have some practical application (well, in theory…).

At first it was planned as a purely educational project, but then I got caught up in the idea and decided to polish it as far as I could.

In what follows, I will talk about how I approached this task, and in doing so I will try to follow the code from the notebook where all the magic happened, while also trying to explain some of my actions. Maybe this will help someone get over their fear of the “blank slate” and see that this kind of thing is really easy to do!

Tools
First things first, let me tell you about the tools used for this project.

Colab/Kaggle: used to train networks on GPUs.

Weights And Biases: a service where I saved models and their descriptions, along with losses, metric values, training parameters, and preprocessing settings. In general, I kept complete records. You can read the data here. The metadata section changed slightly while the code was being written – it actually contains the training and preprocessing parameters. In the files section you can find a description of the network (how its layers are arranged), download the trained network weights, and look at the loss and metric values.
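To give a feel for what such record-keeping looks like, here is a minimal local stand-in for the kind of run record described above – the real service does this with `wandb.init(config=...)` and `wandb.log(...)`. The field names and values below are my own illustration, not the actual W&B schema:

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class RunRecord:
    """One training run: parameters, preprocessing, and metric history."""
    model_name: str
    params: dict            # e.g. learning rate, batch size
    preprocessing: dict     # e.g. resize target, augmentations
    history: list = field(default_factory=list)

    def log(self, step, **metrics):
        """Append one step's losses/metrics to the history."""
        self.history.append({"step": step, **metrics})

    def save(self, path):
        """Dump the full record to JSON for later inspection."""
        with open(path, "w") as f:
            json.dump(asdict(self), f, indent=2)

# Hypothetical run, just to show the shape of the record.
run = RunRecord("resnet-baseline",
                params={"lr": 3e-4, "batch_size": 32},
                preprocessing={"resize": [224, 224]})
run.log(1, loss=1.92, accuracy=0.41)
run.log(2, loss=1.35, accuracy=0.58)
```

Keeping parameters and metric history together in one serializable object is the core of what makes runs reproducible and comparable later.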

Training data
Well, I should probably start with choosing the data to train the neural network on. For this I searched datasets on Kaggle (see here), and this site caught my eye.

Actually, as it turned out, there is a competition from Google devoted precisely to landmark recognition. Here was the first problem: the dataset weighs about 100 GB. Realizing that I would not be training the networks on my own machine anyway, I had to give up this option. After some more research I settled on this dataset. It contains 210 classes with about 50 pictures per class. The pictures are all different sizes, taken from different angles and from different distances. In general, the dataset is not curated at all, whereas until then I had only worked with clean data.
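With a dataset of this shape (210 classes, roughly 50 images each), a sensible first step is a stratified train/validation split, so that every class appears in both parts. A minimal sketch using only the standard library – the file names below are invented for illustration:

```python
import random
from collections import defaultdict

def stratified_split(samples, val_fraction=0.2, seed=42):
    """Split (filename, class_id) pairs, keeping ~val_fraction of each class for validation."""
    by_class = defaultdict(list)
    for name, cls in samples:
        by_class[cls].append(name)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, val = [], []
    for cls, names in by_class.items():
        rng.shuffle(names)
        n_val = max(1, int(len(names) * val_fraction))
        val += [(n, cls) for n in names[:n_val]]
        train += [(n, cls) for n in names[n_val:]]
    return train, val

# 210 classes with ~50 images each, matching the dataset described above.
samples = [(f"img_{c}_{i}.jpg", c) for c in range(210) for i in range(50)]
train, val = stratified_split(samples)
```

A plain random split could easily leave a rare class entirely out of validation; stratifying per class avoids that.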

OpenAI studied GPT-2 with GPT-4 and tried to explain the behavior of neurons
https://www.nexosis.com/openai-studied-gpt-2-with-gpt-4-and-tried-to-explain-the-behavior-of-neurons/
Sat, 22 Apr 2023

Experts from OpenAI published a study describing how they tried to explain the behavior of neurons in GPT-4's predecessor, GPT-2, using the GPT-4 language model. The company's developers now seek to advance the "interpretability" of neural networks and to understand why they produce the content we receive.

In the first sentence of their article, the authors from OpenAI admit: "Language models have become more capable and more widely deployed, but we do not understand how they work." This "ignorance" of exactly how individual neurons in a neural network behave to produce output is referred to as the "black box." According to Ars Technica, trying to look inside the "black box," researchers from OpenAI used their GPT-4 language model to create and evaluate natural-language explanations of neuron behavior in a simpler language model, GPT-2. Ideally, having an interpretable AI model would help achieve a more global goal known as "AI alignment": we would then have assurances that AI systems behave as intended and reflect human values.

OpenAI wanted to figure out which patterns in the text cause neuron activation, and moved in stages. The first step was to explain neuron activation using GPT-4. The second was to simulate neuronal activation with GPT-4, given the explanation from the first step. The third was to evaluate the explanation by comparing simulated and real activations. GPT-4 identified specific neurons, neural circuits, and attention heads, and generated readable explanations of the roles of these components. The large language model also generated an explanation score, which OpenAI calls “a measure of the ability of the language model to compress and reconstruct neuronal activations using natural language.”
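The third step – comparing simulated and real activations – can be caricatured with a plain correlation coefficient. This is only a stand-in to make the idea concrete, not OpenAI's actual scoring metric; the activation values below are hypothetical:

```python
import math

def explanation_score(real, simulated):
    """Pearson correlation between real and simulated neuron activations.

    Near 1.0: the explanation lets the simulator reproduce the neuron's
    behavior. Near 0: the explanation captures almost nothing.
    """
    n = len(real)
    mr = sum(real) / n
    ms = sum(simulated) / n
    cov = sum((r - mr) * (s - ms) for r, s in zip(real, simulated))
    var_r = sum((r - mr) ** 2 for r in real)
    var_s = sum((s - ms) ** 2 for s in simulated)
    return cov / math.sqrt(var_r * var_s)

real = [0.1, 0.9, 0.2, 0.8, 0.05]     # hypothetical neuron activations
good_sim = [0.2, 0.8, 0.3, 0.7, 0.1]  # simulator that tracked the pattern
```

The point of scoring against real activations is that it penalizes explanations that merely sound plausible without actually predicting when the neuron fires.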

During the study, OpenAI also had humans perform the same task as GPT-4 and compared the results. As the authors of the article admitted, both the neural network and the humans "performed poorly in absolute terms."

One explanation for this failure, suggested by OpenAI, is that neurons can be "polysemantic": a typical neuron in the context of the study can have multiple meanings or be associated with multiple concepts. In addition, language models may contain "alien concepts" for which people simply have no words. This could arise for various reasons: for example, because language models care about the statistical constructs used to predict the next token, or because the model has discovered natural abstractions that people have yet to discover, such as a family of similar concepts across disparate domains.

The bottom line at OpenAI is that not all neurons can be explained in natural language, and so far researchers can only see correlations between the input data and the interpreted neuron on a fixed distribution; past scientific work has shown that this may not reflect a causal relationship between the two. Despite this, the researchers are quite optimistic and confident that they have laid the groundwork for machine interpretability. They have now posted on GitHub the code for the automatic interpretation system, the GPT-2 XL neurons, and the explanation datasets.

How to structure machine learning projects using GitHub and VS Code: complete instructions with settings and templates
https://www.nexosis.com/how-to-structure-machine-learning-projects-using-github-and-vs-code-complete-instructions-with-settings-and-templates/
Wed, 01 Mar 2023

A well-designed process for structuring machine learning projects can help you create new GitHub repositories quickly and start from an elegant software architecture. The VS Cloud team has translated an article on how to organize files in machine learning projects using VS Code. A template for creating machine learning projects can be downloaded from GitHub.

Note

To create a new machine learning project from the GitHub template, go to the GitHub repository and click “Use this template”. GitHub template repositories are a very handy thing: they allow me and other users to generate new repositories with the same structure, branches, and files as the template.

The next page opens up project settings, such as repository name and privacy settings:

Having created the repository, click “Actions” on the top menu and wait a bit:

If a green checkmark appears, the project is ready – you can write code!
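The green checkmark means a GitHub Actions workflow from the template ran successfully. For context, a minimal workflow of this kind might look like the following – the contents are illustrative, not the template's actual workflow file:

```yaml
# .github/workflows/ci.yml — runs the test suite on every push
name: CI
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: pytest
```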
Next I’ll tell you why a particular file is added to the project and how the GitHub template was created.

Basic files

First, let’s look at the main files of the project, created on the basis of the template:

.gitignore

The .gitignore file tells Git which files to skip when you commit the project to the GitHub repository. If you are creating a new repository from scratch, you can specify a pre-configured .gitignore file.
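For a Python machine learning project, a typical pre-configured .gitignore includes entries like these – an illustrative selection, not the template's exact file:

```
# Python artifacts
__pycache__/
*.pyc
.venv/

# Notebook checkpoints
.ipynb_checkpoints/

# Large data and model files that don't belong in Git
data/
models/
*.ckpt
```

Keeping datasets and model checkpoints out of Git matters in ML projects in particular, since they can easily exceed GitHub's file-size limits.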
