Differences

This shows you the differences between two versions of the page.

--- neural_network [2024/02/14 12:03] – [3D pose estimation] jhagstrand
+++ neural_network [2024/08/27 10:43] (current) – [Orientation] jhagstrand
@@ Line 1: / Line 1: @@
 ====== Neural Network ======
+[[Gradient Descent]] \\
+[[Linear Algebra]]
 === Neural Network NN ===
@@ Line 22: / Line 25: @@
 https://curriculum.voyc.com/doku.php?id=tensorflow_python_examples#perceptron_neuron_and_gate
+Multilayer Perceptron
+Fully Connected Neural Network == Neural Network
+Is there a "partially connected neural network"?
+Neural Network == Multilayer Perceptron
+weighted sum of the inputs, plus a bias b
+\begin{align*}z^{(i)} = w_{1}x_{1}^{(i)} + w_{2}x_{2}^{(i)} + \dots + w_{k}x_{k}^{(i)} + b\end{align*}
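A minimal sketch of the weighted sum above for a single sample, in plain Python. The function name `weighted_sum` and the sample values are assumptions made up for illustration; they do not come from the page.

```python
def weighted_sum(weights, inputs, bias):
    """Return z = w_1*x_1 + ... + w_k*x_k + b for one sample."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    return sum(w * x for w, x in zip(weights, inputs)) + bias


# Hypothetical values, chosen only for illustration.
w = [0.5, -1.0, 2.0]
x = [2.0, 1.0, 0.5]
b = 0.25

z = weighted_sum(w, x, b)
print(z)  # 1.25
```

In a full neuron, z would then be passed through an activation function; that step is left out here.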
 ==== 2 layer nn ====
 Build a neural network from scratch using python and numpy
@@ Line 35: / Line 48: @@
 ==== Convolutional Neural Network CNN ====
+=== How it Works ===
+https://www.baeldung.com/cs/convolutional-vs-regular-nn
+IK means multiply I times K
+I * K means convolve K over I
+I is the input matrix
+K is the kernel or filter
+result of the convolution layer is the convolved feature map
+result of the pooling layer is the pooled feature map
+Unlike an artificial neuron in a fully-connected layer, a neuron in a convolutional layer is not connected to the entire input but just some section of the input data. These input neurons provide abstractions of small sections of the input data that, when combined over the entire input, we refer to as a feature map.
+== feature extraction ==
+The artificial neurons in a CNN are arranged into 2D or 3D grids which we call filters. Usually, each filter extracts a different type of feature from the input data. For example, from an image, one filter can extract edges, another lines, circles, or more complex shapes.
+== convolutional function ==
+The process of extracting features uses a convolution function, which gives the convolutional neural network its name. Given an input matrix I and a filter K, the filter slides across I. At each position, each element of K is multiplied by the corresponding element of I, and the products are summed into a single number.
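The sliding multiply-and-sum described here can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the implementation of any particular framework: stride 1, no padding, and no kernel flip (the usual CNN convention, strictly a cross-correlation). The names `convolve2d` and `max_pool2d` and the sample matrices are hypothetical.

```python
def convolve2d(I, K):
    """Slide filter K over input I (stride 1, no padding); return the feature map."""
    kh, kw = len(K), len(K[0])
    out_h = len(I) - kh + 1
    out_w = len(I[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            # Element-by-element multiplication of the window and K, then sum.
            for i in range(kh):
                for j in range(kw):
                    total += I[r + i][c + j] * K[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


def max_pool2d(M, size=2):
    """Non-overlapping max pooling; edge cells that do not fill a window are dropped."""
    return [
        [max(M[r + i][c + j] for i in range(size) for j in range(size))
         for c in range(0, len(M[0]) - size + 1, size)]
        for r in range(0, len(M) - size + 1, size)
    ]


# Hypothetical 4x4 input and 3x3 filter.
I = [[1, 2, 3, 0],
     [0, 1, 2, 3],
     [3, 0, 1, 2],
     [2, 3, 0, 1]]
K = [[2, 0, 1],
     [0, 1, 2],
     [1, 0, 2]]

convolved = convolve2d(I, K)
pooled = max_pool2d(convolved, size=2)
print(convolved)  # [[15, 16], [6, 15]]
print(pooled)     # [[16]]
```

The first result corresponds to the convolved feature map and the second to the pooled feature map mentioned in the notes above.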
+=== Variations ===
 three primary object detectors you’ll encounter:
-	R-CNN and their variants, including the original R-CNN, Fast R- CNN, and Faster R-CNN
+  * R-CNN and their variants, including the original R-CNN, Fast R-CNN, and Faster R-CNN
-	Single Shot Detector (SSDs)
+  * Single Shot Detector (SSDs)
-	YOLO
+  * YOLO
 R-CNNs are one of the first deep learning-based object detectors and are an
@@ Line 79: / Line 122: @@
 ==== You Only Look Once YOLO ====
+YOLOv1 2015 Joseph Chet Redmon et al \\
+YOLOv2 2016 Redmon and Farhadi, aka YOLO9000 \\
+YOLOv3 2018 Redmon and Farhadi, Redmon's last version; he stepped away from computer vision research over concerns about military applications \\
+YOLOv4 2020 Alexey Bochkovskiy et al: YOLOv4: Optimal Speed and Accuracy of Object Detection \\
+YOLOv5 Ultralytics, switch from DarkNet to PyTorch \\
+YOLOv6 2022 Meituan (Li et al) \\
+YOLOv7 Alexey Bochkovskiy et al \\
+YOLOv8 Ultralytics \\
+YOLOv9 ? \\
+https://deci.ai/blog/history-yolo-object-detection-models-from-yolov1-yolov8/
 Invented by Joseph Chet Redmon
@@ Line 147: / Line 204: @@
 opencv - computer vision library
+=== Warning ===
+When using these packages simultaneously, I must watch out for the differences, for example:
+  - array dimensions reversed between numpy and opencv
+    - OpenCV: (x,y), (width, height), "columns major", as in image processing
+    - numpy : (y,x), (height, width), "rows major",    as in linear algebra
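A small sketch of the ordering difference, using numpy only so that it runs without OpenCV installed. The image size and point are hypothetical. The comments describe the OpenCV side as an assumption about typical usage: points are given as (x, y), and sizes such as the `dsize` argument of `cv2.resize` as (width, height).

```python
import numpy as np

# OpenCV-style size: (width, height). Hypothetical values.
width, height = 640, 480

# NumPy shape: (rows, cols) = (height, width).
img = np.zeros((height, width), dtype=np.uint8)

# OpenCV-style point: (x, y).
x, y = 600, 100

# NumPy indexing: [row, col] = [y, x]. Writing img[x, y] here would raise
# IndexError, because 600 is out of range for the 480 rows.
img[y, x] = 255

# Converting a NumPy shape back to an OpenCV-style size tuple.
h, w = img.shape[:2]
size_for_opencv = (w, h)

print(img.shape)        # (480, 640)
print(size_for_opencv)  # (640, 480)
```

Unpacking `h, w = img.shape[:2]` once and building the (w, h) tuple explicitly tends to keep the swap in one visible place.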
 ===== AI terminology =====
@@ Line 291: / Line 355: @@
 Kalman Filter IoU KFIoU
+pixel density - ppi, ppcm
+pixel intensity - grayscale 0 to 255, where 255 is brightest (white) and 0 is darkest (black)
+segmentation - Divide image into segments along object contours, so that each pixel is assigned to one segment.  Colorize the image to distinguish the segments.  Background segments may include grass, sky, road, building.   Foreground segments may include car, person, tree, sign.
+semantic segmentation - color identifies object class.  All objects of a class have the same color.
+instance segmentation - color identifies each individual object.  Each object has a unique color and an associated id.
+panoptic segmentation - semantic and instance segmentation combined.  Color identifies class, and each object has an associated id.
+object detection  - Draw a bounding box around each object.
+object classification - Identify the class of an object.  Often used when there is only one object in an image.  Classify it as cat or dog.  Sometimes one layer will identify the objects, and a classifier will be used to determine the class of each object.
+object localization - The bounding box identifies the location of an object within an image.
+object recognition - match the object to similar instances in a database of images.
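The difference between semantic, instance, and panoptic segmentation can be sketched with small label maps. Everything here is hypothetical: a 3x4 "image", two classes, and integer ids standing in for the colors described above.

```python
# Hypothetical 3x4 image: two cars (class 1) on a road background (class 0).
CLASS_NAMES = {0: "road", 1: "car"}

# Semantic segmentation: one class label per pixel.
# Both cars share the same label.
semantic = [
    [0, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
]

# Instance segmentation: one object id per pixel (0 = no object).
# Each car gets its own id.
instance = [
    [0, 1, 0, 2],
    [0, 1, 0, 2],
    [0, 0, 0, 0],
]

# Panoptic segmentation: (class, object id) per pixel, combining the two.
panoptic = [
    [(c, i) for c, i in zip(class_row, id_row)]
    for class_row, id_row in zip(semantic, instance)
]

# Ids of all objects whose class is "car".
car_ids = sorted({i for row in panoptic for (c, i) in row if c == 1})
print(car_ids)  # [1, 2]
```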
+https://uploads-ssl.webflow.com/614c82ed388d53640613982e/64aeb4a43a30bf1bbefd523f_types%20of%20image%20segmentation.webp
+https://images.ctfassets.net/3viuren4us1n/15EpLEkXALLew4JYzSwplX/108b0d48a16b26e0db4c8193e2797091/G_-_J.jpg
+https://www.sentisight.ai/wp-content/uploads/2022/08/segmentation-example.png
+Note.  The contour comes first.  The bounding box can easily be drawn around the contour.
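As a sketch of that note: given a contour as a list of (x, y) pixel coordinates, the axis-aligned bounding box follows from the minimum and maximum of each coordinate. The function name `bounding_box` and the sample contour are hypothetical. The (x, y, width, height) layout is chosen to resemble what OpenCV-style code commonly passes around, and the width and height count pixels inclusively.

```python
def bounding_box(contour):
    """Return (x, y, width, height) of the axis-aligned box enclosing the points."""
    if not contour:
        raise ValueError("contour must contain at least one point")
    xs = [x for x, _ in contour]
    ys = [y for _, y in contour]
    x_min, y_min = min(xs), min(ys)
    # +1 because both the min and max pixel lie inside the box.
    return x_min, y_min, max(xs) - x_min + 1, max(ys) - y_min + 1


# Hypothetical contour points in (x, y) order.
contour = [(3, 2), (7, 4), (5, 9), (2, 6)]
box = bounding_box(contour)
print(box)  # (2, 2, 6, 8)
```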
+framework
+model
+configuration
+backend
+blob
+outputs
@@ Line 375: / Line 480: @@
 ==== Pipeline ====
+preprocessing
@@ Line 386: / Line 491: @@
-==== Orientation ====
-Google search:
-aerial photo object orientation
@@ Line 401: / Line 503: @@
 https://www.youtube.com/watch?v=wg7uYDonGu0
+==== Orientation ====
+Google search:
+aerial photo object orientation
+===== LLM =====
+==== NLP ====
+==== AI Code Writing ====
+As of August 2024
+AI models and tools for code writing
+from Grok
+OpenAI: ChatGPT
+VS Code, GitHub, Copilot: All from the Microsoft world.
+nlp
+llm
+transformer
+capabilities:
+  * code suggestions
+  * autocompletion
+  * generate a function from a prompt
+stand-alone operation vs integrated with IDE
+LLM coding assistant
+  * the model
+  * the corpora
+    * source code
+    * multiple languages
+    * organized into categories and contexts
+  * the trained model
+  * the interface:
+    * chatbot
+    * virtual assistant
+    * code completion plugin for an IDE
+^ Company  ^ Product        ^  Open Source  ^ Languages  ^ Comment  ^
+| OpenAI   | ChatGPT        |      No       |            |          |
+| OpenAI   | Codex          |      No       |            |          |
+| GitHub   | Copilot        |      No       |            | Based on Codex, integrated into IDEs like VS Code.         |
+| xAI      | Grok           |      ?        |            |          |
+| Amazon   | CodeWhisperer  |      No       |            | For use with AWS  |
+| Tabnine  | Tabnine        |      No       |            |                   |
+Tabnine: Tabnine
+AI-based code completion with support for over 30 programming languages.
+It's known for its ability to run locally or in the cloud, providing flexibility in deployment.
+Tabnine also emphasizes privacy by allowing developers to host their own models.
+Meta:Code Llama
+free for both research and commercial use.
+Code Llama has been highlighted for its performance in coding tasks,
+even outperforming some versions of models like GPT-3.5 in certain benchmarks.
+?:DeepSeek-v2-Coder
+Mentioned for its impressive performance in coding tasks,
+this model has been recognized for producing 100% compilable Java code in some evaluations,
+indicating high-quality code generation capabilities.
+Anthropic: Claude
+While primarily known for its conversational abilities,
+Claude's latest iterations, like Claude 3.5, have been praised for coding proficiency,
+especially in understanding and generating code for less common libraries or languages.
+Local and open-source models:
+There's a growing trend towards using open-source models like those based on LLaMA, Pythia,
+or even customized versions of these models for coding tasks.
+Tools like `ollama.nvim` for Neovim or platforms allowing you to run these models
+locally or on personal servers are becoming popular for those who prefer not to rely on
+cloud-based solutions.
+xAI: Grok
+Although primarily known for its conversational abilities and integration with X (formerly Twitter),
+Grok's capabilities in understanding and potentially generating code could be inferred from its
+general language processing skills, though specific coding features might be less documented.
+Each of these models or tools brings unique strengths to the table,
+from integration capabilities with existing workflows,
+support for a wide array of programming languages,
+to performance in generating high-quality, compilable code.
+The choice between them might depend on factors like integration with
+your current development environment, privacy concerns, cost, or specific coding task requirements.
+Remember, while these tools can significantly enhance productivity,
+they should be used as aids, with human oversight for critical or complex coding tasks.