<h1>Traffic Sign Recognition with Tensorflow (2017-03-31), by Giovanni Claudio</h1>
<h1 id="introduction">Introduction</h1>
<p>In this project, I used a convolutional neural network (CNN) to classify traffic signs. I trained and validated a model so it can classify traffic sign images using the <a href="http://benchmark.ini.rub.de/">German Traffic Sign Dataset</a>. After the model is trained, I tried out the model on images of traffic signs that I took with my smartphone camera.</p>
<p>My final model results are:</p>
<ul>
<li>Training set accuracy of 97.5%</li>
<li>Validation set accuracy of 98.5%</li>
<li>Test set accuracy of 97.2%</li>
<li>New Test set accuracy of 100% (6 new images taken by me)</li>
</ul>
<p>Here is the <a href="https://github.com/jokla/CarND-Traffic-Sign-Classifier-Project/blob/master/Traffic_Sign_Classifier.ipynb">project code</a>. Please note that I used only the CPU of my laptop to train the network.</p>
<p>The steps of this project are the following:</p>
<ul>
<li>Load the data set</li>
<li>Explore, summarize and visualize the data set</li>
<li>Design, train and test a model architecture</li>
<li>Use the model to make predictions on new images</li>
<li>Analyze the softmax probabilities of the new images</li>
<li>Summarize the results with a written report</li>
</ul>
<h1 id="data-set-summary--exploration">Data Set Summary & Exploration</h1>
<h2 id="1-basic-summary-of-the-data-set">1. Basic summary of the data set</h2>
<p>I used the Pandas library to calculate summary statistics of the traffic
signs data set:</p>
<ul>
<li>The size of original training set is 34799</li>
<li>The size of the validation set is 4410</li>
<li>The size of test set is 12630</li>
<li>The shape of a traffic sign image is 32x32x3 represented as integer values (0-255) in the RGB color space</li>
<li>The number of unique classes/labels in the data set is 43</li>
</ul>
<p>We have to work with images with a resolution of 32x32x3 representing 43 different types of German traffic signs.</p>
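<p>The summary above can be reproduced with a few lines of code. The sketch below uses only the Python standard library and hypothetical label lists standing in for the real dataset splits (the original notebook uses Pandas):</p>

```python
from collections import Counter

# Hypothetical label lists standing in for the real training/validation splits.
y_train = [0] * 180 + [1] * 1980 + [14] * 690
y_valid = [0] * 30 + [1] * 240 + [14] * 90

n_train = len(y_train)                 # size of the training set
n_valid = len(y_valid)                 # size of the validation set
n_classes = len(set(y_train))          # number of unique classes

print("Training examples:  ", n_train)
print("Validation examples:", n_valid)
print("Unique classes:     ", n_classes)

# Per-class sample counts, as in the class listing further down.
for cls, count in sorted(Counter(y_train).items()):
    print("Class %d: %d samples" % (cls, count))
```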
<h2 id="2-exploratory-visualization-of-the-dataset">2. Exploratory visualization of the dataset</h2>
<p>Here is an exploratory visualization of the data set. It is a bar chart showing how many samples we have for each class.</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/dist_class_training.png" width="500" alt="Distribution Class Training" /></p>
<p>We can notice that the distribution is not balanced. Some classes have fewer than 300 examples, while others are well represented with more than 1000 examples. Let us now analyze the validation dataset distribution:</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/dist_class_validation.png" width="500" alt="Distribution Class Validation" /></p>
<p>The distributions are very similar. Although it might be wise to balance the dataset, I am not sure it would be very useful in this case. In fact, some traffic signs (for example, the 20&nbsp;km/h speed limit) may simply occur less frequently than others (the stop sign, for example). For this reason, I decided not to balance the dataset.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Class 0: Speed limit (20km/h) 180 samples
Class 1: Speed limit (30km/h) 1980 samples
Class 2: Speed limit (50km/h) 2010 samples
Class 3: Speed limit (60km/h) 1260 samples
Class 4: Speed limit (70km/h) 1770 samples
Class 5: Speed limit (80km/h) 1650 samples
Class 6: End of speed limit (80km/h) 360 samples
Class 7: Speed limit (100km/h) 1290 samples
Class 8: Speed limit (120km/h) 1260 samples
Class 9: No passing 1320 samples
Class 10: No passing for vehicles over 3.5 metric tons 1800 samples
Class 11: Right-of-way at the next intersection 1170 samples
Class 12: Priority road 1890 samples
Class 13: Yield 1920 samples
Class 14: Stop 690 samples
Class 15: No vehicles 540 samples
Class 16: Vehicles over 3.5 metric tons prohibited 360 samples
Class 17: No entry 990 samples
Class 18: General caution 1080 samples
Class 19: Dangerous curve to the left 180 samples
Class 20: Dangerous curve to the right 300 samples
Class 21: Double curve 270 samples
Class 22: Bumpy road 330 samples
Class 23: Slippery road 450 samples
Class 24: Road narrows on the right 240 samples
Class 25: Road work 1350 samples
Class 26: Traffic signals 540 samples
Class 27: Pedestrians 210 samples
Class 28: Children crossing 480 samples
Class 29: Bicycles crossing 240 samples
Class 30: Beware of ice/snow 390 samples
Class 31: Wild animals crossing 690 samples
Class 32: End of all speed and passing limits 210 samples
Class 33: Turn right ahead 599 samples
Class 34: Turn left ahead 360 samples
Class 35: Ahead only 1080 samples
Class 36: Go straight or right 330 samples
Class 37: Go straight or left 180 samples
Class 38: Keep right 1860 samples
Class 39: Keep left 270 samples
Class 40: Roundabout mandatory 300 samples
Class 41: End of no passing 210 samples
Class 42: End of no passing by vehicles over 3.5 metric tons 210 samples
</code></pre></div></div>
<h1 id="design-and-test-a-model-architecture">Design and Test a Model Architecture</h1>
<h2 id="1-pre-processing">1. Pre-processing</h2>
<p>This phase is crucial to improving the performance of the model. First of all, I decided to convert the RGB images to grayscale. This reduces the number of channels in the input of the network without decreasing performance. In fact, as Pierre Sermanet and Yann LeCun mention in their paper <a href="http://yann.lecun.com/exdb/publis/pdf/sermanet-ijcnn-11.pdf">“Traffic Sign Recognition with Multi-Scale Convolutional Networks”</a>, using color channels did not seem to improve the classification accuracy. Also, to help the training phase, I normalized each image to the range 0 to 1 and translated it to obtain zero mean. I also applied <a href="https://en.wikipedia.org/wiki/Adaptive_histogram_equalization">Contrast Limited Adaptive Histogram Equalization</a> (CLAHE), an algorithm for local contrast enhancement, which uses histograms computed over different tile regions of the image. Local details can therefore be enhanced even in regions that are darker or lighter than most of the image. This should help feature extraction.</p>
<p>Here is the function I used to pre-process each image in the dataset:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">pre_processing_single_img</span> <span class="p">(</span><span class="n">img</span><span class="p">):</span>
<span class="n">img_y</span> <span class="o">=</span> <span class="n">cv2</span><span class="p">.</span><span class="n">cvtColor</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="p">(</span><span class="n">cv2</span><span class="p">.</span><span class="n">COLOR_BGR2YUV</span><span class="p">))[:,:,</span><span class="mi">0</span><span class="p">]</span>
<span class="n">img_y</span> <span class="o">=</span> <span class="p">(</span><span class="n">img_y</span> <span class="o">/</span> <span class="mf">255.</span><span class="p">).</span><span class="n">astype</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">float32</span><span class="p">)</span>
<span class="n">img_y</span> <span class="o">=</span> <span class="p">(</span><span class="n">exposure</span><span class="p">.</span><span class="n">equalize_adapthist</span><span class="p">(</span><span class="n">img_y</span><span class="p">,)</span> <span class="o">-</span> <span class="mf">0.5</span><span class="p">)</span>
<span class="n">img_y</span> <span class="o">=</span> <span class="n">img_y</span><span class="p">.</span><span class="n">reshape</span><span class="p">(</span><span class="n">img_y</span><span class="p">.</span><span class="n">shape</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span><span class="p">,))</span>
<span class="k">return</span> <span class="n">img_y</span>
</code></pre></div></div>
<p>Steps:</p>
<ul>
<li>
<p>Convert the image to <a href="https://en.wikipedia.org/wiki/YUV">YUV</a> and extract the Y channel, which corresponds to the grayscale image:<br />
<code class="language-plaintext highlighter-rouge">img_y = cv2.cvtColor(img, (cv2.COLOR_BGR2YUV))[:,:,0]</code>
Y stands for the luma component (the brightness), while U and V are the chrominance (color) components.</p>
</li>
<li>
<p>Normalize the image to the range 0 to 1: <br />
<code class="language-plaintext highlighter-rouge">img_y = (img_y / 255.).astype(np.float32)</code></p>
</li>
<li>
<p>Contrast Limited Adaptive Histogram Equalization (see <a href="http://scikit-image.org/docs/dev/api/skimage.exposure.html#skimage.exposure.equalize_adapthist">here</a> for more information) and translate the result to have mean zero: <br />
<code class="language-plaintext highlighter-rouge">img_y = (exposure.equalize_adapthist(img_y,) - 0.5)</code></p>
</li>
<li>
<p>Finally, reshape the image from (32x32) to (32x32x1), the format required by TensorFlow: <br />
<code class="language-plaintext highlighter-rouge">img_y = img_y.reshape(img_y.shape + (1,))</code></p>
</li>
</ul>
<p>Here is an example of a traffic sign image before and after the processing:</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/original_samples3.png" width="360" /> <img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/prepro_train.png" width="360" /></p>
<p>Initially, I used <code class="language-plaintext highlighter-rouge">exposure.adjust_log</code>, which is quite fast, but I finally decided to use <code class="language-plaintext highlighter-rouge">exposure.equalize_adapthist</code>, which gives better accuracy.</p>
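<p>Setting CLAHE aside (it needs scikit-image), the normalization and zero-centring steps above can be sketched in plain Python on a toy grayscale image; the toy values are made up for illustration:</p>

```python
def normalize_and_center(img):
    """Scale 8-bit pixel values to [0, 1], then shift them to [-0.5, 0.5].

    A dependency-free stand-in for the normalization steps above; the YUV
    conversion and CLAHE (cv2 / skimage.exposure) are omitted.
    """
    return [[(p / 255.0) - 0.5 for p in row] for row in img]

toy = [[0, 128, 255],
       [64, 192, 32]]
out = normalize_and_center(toy)
print(out[0])  # first row, now centred around zero
```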
<h2 id="2-augmentation">2. Augmentation</h2>
<p>To add more data, I created two new datasets starting from the original training dataset, which is composed of 34799 examples. In this way, I obtained 34799x3 = 104397 samples in the training dataset.</p>
<h2 id="keras-imagedatagenerator">Keras ImageDataGenerator</h2>
<p>I used the Keras function <a href="https://keras.io/preprocessing/image/">ImageDataGenerator</a> to generate new images with the following settings:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="n">datagen</span> <span class="o">=</span> <span class="n">ImageDataGenerator</span><span class="p">(</span>
<span class="n">rotation_range</span><span class="o">=</span><span class="mi">17</span><span class="p">,</span>
<span class="n">width_shift_range</span><span class="o">=</span><span class="mf">0.1</span><span class="p">,</span>
<span class="n">height_shift_range</span><span class="o">=</span><span class="mf">0.1</span><span class="p">,</span>
<span class="n">shear_range</span><span class="o">=</span><span class="mf">0.3</span><span class="p">,</span>
<span class="n">zoom_range</span><span class="o">=</span><span class="mf">0.15</span><span class="p">,</span>
<span class="n">horizontal_flip</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
<span class="n">dim_ordering</span><span class="o">=</span><span class="s">'tf'</span><span class="p">,</span>
<span class="n">fill_mode</span><span class="o">=</span><span class="s">'nearest'</span><span class="p">)</span>
<span class="c1"># configure batch size and retrieve one batch of images
</span>
<span class="k">for</span> <span class="n">X_batch</span><span class="p">,</span> <span class="n">y_batch</span> <span class="ow">in</span> <span class="n">datagen</span><span class="p">.</span><span class="n">flow</span><span class="p">(</span><span class="n">X_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">,</span> <span class="n">batch_size</span><span class="o">=</span><span class="n">X_train</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">shuffle</span><span class="o">=</span><span class="bp">False</span><span class="p">):</span>
<span class="k">print</span><span class="p">(</span><span class="n">X_batch</span><span class="p">.</span><span class="n">shape</span><span class="p">)</span>
<span class="n">X_train_aug</span> <span class="o">=</span> <span class="n">X_batch</span><span class="p">.</span><span class="n">astype</span><span class="p">(</span><span class="s">'uint8'</span><span class="p">)</span>
<span class="n">y_train_aug</span> <span class="o">=</span> <span class="n">y_batch</span>
<span class="k">break</span>
</code></pre></div></div>
<p>A rotation, a translation, a zoom and a shear transformation are applied to each picture in the training dataset.</p>
<p>Here is an example of an original image and an augmented image:</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/original_samples.png" width="360" /> <img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/keras_prepro_samples.png" width="360" /></p>
<h2 id="motion-blur">Motion Blur</h2>
<p>Motion blur is the apparent streaking of rapidly moving objects in a still image. I thought it would be a good idea to add motion blur to the images, since they are taken by a camera placed on a moving car.</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/original_samples1.png" width="360" /> <img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/mb_prepro_samples.png" width="360" /></p>
<h2 id="3-final-model-architecture">3. Final model architecture</h2>
<p>I started from the LeNet network and modified it to use multi-scale features, taking inspiration from the model presented in the paper by <a href="http://yann.lecun.com/exdb/publis/pdf/sermanet-ijcnn-11.pdf">Pierre Sermanet and Yann LeCun</a>. Finally, I increased the number of filters used in the first two convolutions.
We have 3 layers in total: 2 convolutional layers for feature extraction and one fully connected layer for classification. Note that my network has one convolutional layer fewer than the <a href="http://yann.lecun.com/exdb/publis/pdf/sermanet-ijcnn-11.pdf">Pierre Sermanet and Yann LeCun</a> version.</p>
<table>
<thead>
<tr>
<th style="text-align: center">Layer</th>
<th style="text-align: center">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center">Input</td>
<td style="text-align: center">32x32x1 Grayscale image</td>
</tr>
<tr>
<td style="text-align: center">Convolution 5x5</td>
<td style="text-align: center">1x1 stride, valid padding, outputs 28x28x12</td>
</tr>
<tr>
<td style="text-align: center">RELU</td>
<td style="text-align: center"> </td>
</tr>
<tr>
<td style="text-align: center">Max pooling</td>
<td style="text-align: center">2x2 stride, outputs 14x14x12</td>
</tr>
<tr>
<td style="text-align: center">Dropout (a)</td>
<td style="text-align: center">0.7</td>
</tr>
<tr>
<td style="text-align: center">Convolution 5x5</td>
<td style="text-align: center">1x1 stride, valid padding, outputs 10x10x24</td>
</tr>
<tr>
<td style="text-align: center">RELU</td>
<td style="text-align: center"> </td>
</tr>
<tr>
<td style="text-align: center">Max pooling</td>
<td style="text-align: center">2x2 stride, outputs 5x5x24</td>
</tr>
<tr>
<td style="text-align: center">Dropout (b)</td>
<td style="text-align: center">0.6</td>
</tr>
<tr>
<td style="text-align: center">Fully connected</td>
<td style="text-align: center">max_pool(a) + (b) flattened. Input = 1188. Output = 320</td>
</tr>
<tr>
<td style="text-align: center">Dropout (c)</td>
<td style="text-align: center">0.5</td>
</tr>
<tr>
<td style="text-align: center">Fully connected</td>
<td style="text-align: center">Input = 320. Output = n_classes</td>
</tr>
<tr>
<td style="text-align: center">Softmax</td>
<td style="text-align: center"> </td>
</tr>
</tbody>
</table>
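<p>The output sizes in the table can be checked with a little shape arithmetic. Assuming 5x5 “valid” convolutions with stride 1 and 2x2 pooling (the combination that matches the 32 → 28 → 14 → 10 → 5 progression listed), the flattened multi-scale input to the fully connected layer works out as follows:</p>

```python
def conv_valid(size, kernel):
    """Output size of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool(size, stride=2):
    """Output size of 2x2 max pooling with stride 2."""
    return size // stride

c1 = conv_valid(32, 5)   # conv1: 32 -> 28
a = pool(c1)             # pool1: 28 -> 14  (branch a)
c2 = conv_valid(a, 5)    # conv2: 14 -> 10
b = pool(c2)             # pool2: 10 -> 5   (branch b)

# Multi-scale trick: pool branch (a) once more, flatten both branches,
# and concatenate them before the fully connected layer.
flat = pool(a) ** 2 * 12 + b ** 2 * 24
print(flat)  # 7*7*12 + 5*5*24 = 588 + 600 = 1188
```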
<p>To train the model I used 20 epochs with a batch size of 128 and the <a href="https://www.tensorflow.org/api_docs/python/tf/train/AdamOptimizer">AdamOptimizer</a> (see the paper <a href="https://arxiv.org/pdf/1412.6980v8.pdf">here</a>) with a learning rate of 0.001. The training phase is quite slow using only the CPU, which is why I limited it to 20 epochs.</p>
<p>My final model results were:</p>
<ul>
<li>Training set accuracy of 97.5%</li>
<li>Validation set accuracy of 98.5%</li>
<li>Test set accuracy of 97.2%</li>
</ul>
<h3 id="first-attempt-validation-accuracy-915">First attempt: validation accuracy 91.5%</h3>
<p>Initially, I started with the <a href="http://yann.lecun.com/exdb/lenet/">LeNet architecture</a>, a convolutional network designed for handwritten and machine-printed character recognition.</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/lenet5.png" width="900" /></p>
<p>I used the following preprocessing pipeline:</p>
<ul>
<li>Convert in YUV, keep the Y</li>
<li>Adjust the exposure</li>
<li>Normalization</li>
</ul>
<p>Parameters:</p>
<ul>
<li>EPOCHS = 10</li>
<li>BATCH_SIZE = 128</li>
<li>Learning rate = 0.001</li>
</ul>
<p>Number of training examples = 34799 <br />
Number of validation examples = 4410 <br />
Number of testing examples = 12630</p>
<p>At each step, I will mention only the changes I adopted to improve the accuracy.</p>
<h3 id="second-attempt-validation-accuracy-931">Second attempt: validation accuracy 93.1%</h3>
<p>I added Dropout after each layer of the LeNet network, with the following keep probabilities: <br />
1) <code class="language-plaintext highlighter-rouge">0.9</code> (after C1) <br />
2) <code class="language-plaintext highlighter-rouge">0.7</code> (after C3) <br />
3) <code class="language-plaintext highlighter-rouge">0.6</code> (after C5) <br />
4) <code class="language-plaintext highlighter-rouge">0.5</code> (after F6)</p>
<h3 id="third-attempt-validation-accuracy-933">Third attempt: validation accuracy 93.3%</h3>
<p>I changed the network using multi-scale features as suggested in the paper <a href="https://www.google.fr/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=0ahUKEwi079aWzOjSAhWHJ8AKHUx_ARkQFggdMAA&url=http%3A%2F%2Fyann.lecun.org%2Fexdb%2Fpublis%2Fpsgz%2Fsermanet-ijcnn-11.ps.gz&usg=AFQjCNGTHlNOHKmIxaKYw3_h-VYrsgpCag&sig2=llvR7_9QizK3hkAgkmUKTw">Traffic Sign Recognition with Multi-Scale Convolutional Networks</a> and use only one fully connected layer at the end of the network.</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/net.png" width="800" /></p>
<h3 id="fourth-attempt-validation-accuracy-946">Fourth attempt: validation accuracy 94.6%</h3>
<p>I augmented the training set using the Keras function <a href="https://keras.io/preprocessing/image/">ImageDataGenerator</a>. In this way, I doubled the training set.</p>
<p>Number of training examples = 34799x2 = 69598 <br />
Number of validation examples = 4410</p>
<p>I used Dropout with the following keep probabilities (referring to <a href="https://github.com/jokla/CarND-Traffic-Sign-Classifier-Project/blob/master/writeup.md#3-final-model-architecture">this table</a>): <br />
a) <code class="language-plaintext highlighter-rouge">0.8</code> <br />
b) <code class="language-plaintext highlighter-rouge">0.7</code> <br />
c) <code class="language-plaintext highlighter-rouge">0.6</code></p>
<h3 id="fifth-attempt-validation-accuracy-961">Fifth attempt: validation accuracy 96.1%</h3>
<p>Since the training accuracy was not very high, I decided to increase the number of filters in the first two convolutional layers. <br />
First layer: from 6 to 12 filters. <br />
Second layer: from 16 to 24 filters.</p>
<h3 id="final-attempt-validation-accuracy-985">Final attempt: validation accuracy 98.5%</h3>
<p>I augmented the data by adding motion blur to each sample of the training data, tripling the number of samples in the training set. In addition, I added L2 regularization and used the function <code class="language-plaintext highlighter-rouge">equalize_adapthist</code> instead of <code class="language-plaintext highlighter-rouge">exposure.adjust_log</code> during the image preprocessing.</p>
<h2 id="performance-on-the-test-set">Performance on the test set</h2>
<p>Finally, I evaluated the performance of my model on the test set.</p>
<h3 id="accuracy">Accuracy</h3>
<p>The accuracy was equal to 97.2%.</p>
<h3 id="precision">Precision</h3>
<p>The precision was equal to 96.6%. <br />
The precision is the ratio <code class="language-plaintext highlighter-rouge">tp / (tp + fp)</code> where <code class="language-plaintext highlighter-rouge">tp</code> is the number of true positives and <code class="language-plaintext highlighter-rouge">fp</code> the number of false positives. The precision is intuitively the ability of the classifier not to label as positive a sample that is negative.</p>
<h3 id="recall">Recall</h3>
<p>The recall was equal to 97.2%. <br />
The recall is the ratio <code class="language-plaintext highlighter-rouge">tp / (tp + fn)</code> where <code class="language-plaintext highlighter-rouge">tp</code> is the number of true positives and <code class="language-plaintext highlighter-rouge">fn</code> the number of false negatives. The recall is intuitively the ability of the classifier to find all the positive samples.</p>
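<p>Both metrics follow directly from the true-positive, false-positive and false-negative counts. A minimal sketch with made-up labels, treating one class as “positive” in a one-vs-rest fashion:</p>

```python
def precision_recall(y_true, y_pred, positive):
    """Compute precision = tp/(tp+fp) and recall = tp/(tp+fn) for one class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if p != positive and t == positive)
    return tp / (tp + fp), tp / (tp + fn)

# Made-up labels: class 14 (Stop) vs everything else.
y_true = [14, 14, 14, 2, 2, 14]
y_pred = [14, 14, 2, 2, 14, 14]
prec, rec = precision_recall(y_true, y_pred, positive=14)
print(prec, rec)  # 0.75 0.75
```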
<h3 id="confusion-matrix">Confusion matrix</h3>
<p>Let’s analyze the <a href="https://en.wikipedia.org/wiki/Confusion_matrix">confusion matrix</a>:</p>
<div>
<a href="https://plot.ly/~jokla/1/?share_key=DmJjGBAv9EQXNjMc5jDWCT" target="_blank" title="Plot 1" style="display: block; text-align: center;"><img src="https://plot.ly/~jokla/1.png?share_key=DmJjGBAv9EQXNjMc5jDWCT" alt="Plot 1" style="max-width: 100%;width: 600px;" width="900" onerror="this.onerror=null;this.src='https://plot.ly/404.png';" /></a>
</div>
<p>You can click on the picture to interact with the plot.</p>
<p>We can notice that:</p>
<ul>
<li>28/60 samples of class 19 (Dangerous curve to the left) are misclassified as class 23 (Slippery road). This can be explained by the fact that class 19 is underrepresented in the training set: it has only 180 samples.</li>
<li>34/630 samples of class 5 (Speed limit 80km/h) are misclassified as class 2 (Speed limit 50km/h).</li>
<li>The model is not very good at classifying class 30 (Beware of ice/snow): it misclassified its samples 61 times.</li>
<li>The model produces 80 false positives for the class 23.</li>
</ul>
<h1 id="test-a-model-on-new-images">Test a Model on New Images</h1>
<p>Here are six traffic signs from pictures I took in France with my smartphone: <br />
<img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/11_Rightofway.jpg" width="100" /> <img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/25_RoadWork.jpg" width="100" /> <img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/14_Stop.jpg" width="100" /> <br />
<img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/17_Noentry.jpg" width="100" /> <img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/12_PriorityRoad.jpg" width="100" /> <img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/33_RightOnly.jpg" width="100" /></p>
<p>Here are the results of the prediction:</p>
<ul>
<li>the first image is the test image</li>
<li>the second one is a random picture of the same class of the prediction</li>
<li>the third one is a plot showing the top five softmax probabilities</li>
</ul>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/new_sign1.png" width="480" />
<img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/new_sign2.png" width="480" />
<img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/new_sign3.png" width="480" />
<img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/new_sign4.png" width="480" />
<img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/new_sign5.png" width="480" />
<img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/new_sign6.png" width="480" /></p>
<p>The model was able to correctly guess 6 of the 6 traffic signs, which gives an accuracy of 100%. Nice!</p>
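<p>The top-five softmax probabilities shown in the plots come from the logits of the final layer. A standard-library sketch (the logits below are made up):</p>

```python
import math

def softmax(logits):
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(probs, k=5):
    """Return (class index, probability) pairs sorted by descending probability."""
    return sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)[:k]

logits = [0.5, 3.2, -1.0, 8.9, 0.1, 2.2, 7.4]   # made-up network outputs
probs = softmax(logits)
for cls, p in top_k(probs):
    print("class %d: %.4f" % (cls, p))
```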
<h1 id="visualize-the-neural-networks-state-with-test-images">Visualize the Neural Network’s State with Test Images</h1>
<p>We can better understand what the weights of a neural network look like by plotting their feature maps. After successfully training the network, we can plot the output of its weight layers in response to a test stimulus image. From these feature maps, it is possible to see which characteristics of an image the network finds interesting. For a sign, the inner feature maps may react with high activation to the sign’s boundary outline or to the contrast of the sign’s painted symbol.</p>
<p>Here is the output of the first convolutional layer:<br />
<img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/visualize1.png" width="700" /> <br />
Here is the output of the second convolutional layer:<br />
<img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/visualize2.png" width="700" /></p>
<p>We can notice that the CNN learned to detect useful features on its own; in the first picture, for example, we can see some edges of the sign.</p>
<p>Now another example using a test picture with no sign on it:</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/no_sign.png" width="100" /></p>
<p>In this case the CNN does not recognize any useful features. The activations of the first feature map appear to contain mostly noise:</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-31-traffic-signs/visualize1_nosign.png" width="700" /></p>
<h1 id="final-considerations">Final considerations</h1>
<h2 id="premise">Premise:</h2>
<p>Only the CPU of my laptop was used to train the network, because I did not have a good GPU at my disposal. I chose not to use any online service like AWS or FloydHub, mostly because I was waiting for the arrival of a GTX 1080. Unfortunately, it did not arrive in time for this project. This required me to use a small network and to keep the number of epochs around 20.</p>
<h2 id="some-possible-improvements">Some possible improvements:</h2>
<ul>
<li>I would use Keras to define the network and its ImageDataGenerator function to generate augmented samples on the fly. Using more data could improve the performance of the model. In my case, I generated an augmented dataset once, saved it to disk, and reused it for every training run. It would be better to randomly generate the dataset before each training.</li>
<li>The confusion matrix gives us suggestions to improve the model (see section <code class="language-plaintext highlighter-rouge">Confusion matrix</code>). There are some classes with low precision or recall, and it would be useful to add more data for them. For example, I would generate new samples for class 19 (Dangerous curve to the left), since it has only 180 samples and the model often confuses it with class 23.</li>
<li>The accuracy on the training set is 97.5%, which means the model is probably underfitting a little. I tried making the network deeper (adding more layers) and increasing the number of filters, but it was too slow to train using only the CPU.</li>
<li>The model worked well on new images taken with my camera (100% accuracy). It would be useful to test the model on more complicated examples.</li>
</ul>
<h1>Finding Lane Lines on the Road (2017-03-10), by Giovanni Claudio</h1>
<h1 id="overview">Overview:</h1>
<p>The goal of this project is to make a pipeline that finds lane lines on the road using Python and OpenCV. See an example:</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/solidWhiteRight.jpg" width="360" alt="Combined Image" /> <img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/laneLines_thirdPass.jpg" width="360" alt="Combined Image" /></p>
<p>The pipeline will be tested on some images and videos provided by Udacity. The following assumptions are made:</p>
<ul>
<li>The camera always has the same position with respect to the road</li>
<li>There is always a visible white or yellow line on the road</li>
<li>We don’t have any vehicle in front of us</li>
<li>We consider highway scenario with good weather conditions</li>
</ul>
<p><a href="https://github.com/jokla/CarND-LaneLines-P1">Here</a> you can find the project.</p>
<hr />
<h1 id="reflection">Reflection</h1>
<h2 id="1-pipeline-description">1. Pipeline description</h2>
<p>I will use the following picture to show you all the steps:</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/original.png" width="360" alt="Combined Image" /></p>
<h3 id="color-selection">Color selection</h3>
<p>Firstly, I applied color filtering to suppress non-yellow and non-white colors. Pixels above the thresholds have been retained, while pixels below the thresholds have been blacked out. This is the result:</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/mask_color.png" width="360" alt="Combined Image" /></p>
<p>I will keep this mask aside and use it later.</p>
<h3 id="convert-the-color-image-in-grayscale">Convert the color image in grayscale</h3>
<p>The original image is converted to grayscale, so that we have only one channel:</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/gray.png" width="360" alt="Combined Image" /></p>
<h3 id="use-canny-for-edge-detection">Use Canny for edge detection</h3>
<p>Before running the <a href="http://docs.opencv.org/2.4/doc/tutorials/imgproc/imgtrans/canny_detector/canny_detector.html">Canny detector</a>, I applied a <a href="http://docs.opencv.org/2.4/modules/imgproc/doc/filtering.html?highlight=gaussianblur#gaussianblur">Gaussian smoothing</a>, which is essentially a way of suppressing noise and spurious gradients by averaging. The Canny detector finds the edges in the image. To improve the result, I also used the OpenCV functions <code class="language-plaintext highlighter-rouge">dilate</code> and <code class="language-plaintext highlighter-rouge">erode</code>.</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/canny.png" width="360" alt="Combined Image" /></p>
<h3 id="merge-canny-and-color-selection">Merge Canny and Color Selection</h3>
<p>In some cases, the Canny edge detector fails to find the lines. For example, when there is not enough contrast between the asphalt and the line, as in the challenge video (see section <code class="language-plaintext highlighter-rouge">Optional challenge</code>). The color selection, on the other hand, doesn’t have this problem. For this reason, I decided to merge the result of the Canny detector and the color selection:</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/merge.png" width="360" alt="Combined Image" /></p>
<h3 id="region-of-interest-mask">Region Of Interest Mask</h3>
<p>I defined a left and right trapezoidal Region Of Interest (ROI) based on the image size. Since the front-facing camera is mounted in a fixed position, we assume here that the lane lines will always appear in the same region of the image.</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/roi.png" width="360" alt="Combined Image" /></p>
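<p>A trapezoidal ROI can be expressed directly as inequalities on the pixel coordinates. A NumPy sketch, with made-up proportions (the project derives its own trapezoids from the image size):</p>

```python
import numpy as np

def roi_mask(h, w):
    """Trapezoid: narrow near the image centre, full width at the bottom.
    The 0.6/0.45/0.55 proportions are illustrative assumptions."""
    ys, xs = np.mgrid[0:h, 0:w]
    top, bottom = 0.6 * h, float(h)                   # keep only the lower 40% of rows
    t = np.clip((ys - top) / (bottom - top), 0, 1)    # 0 at `top`, 1 at the bottom row
    left = w * (0.45 - 0.45 * t)                      # x-extent widens towards the bottom
    right = w * (0.55 + 0.45 * t)
    return ((ys >= top) & (xs >= left) & (xs <= right)).astype(np.uint8)

mask = roi_mask(100, 100)   # multiply (or AND) this with the edge image
```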
<h3 id="run-hough-transform-to-detect-lines">Run Hough transform to detect lines</h3>
<p>The Hough transform is used to detect lines in the images. At this step, I applied a slope filter to get rid of horizontal lines. This is the result: <br />
<img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/hough.png" width="360" alt="Combined Image" /></p>
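<p>The slope filter applied to the Hough segments can be sketched in plain Python. Segments are in the <code>(x1, y1, x2, y2)</code> format that <code>cv2.HoughLinesP</code> returns; the minimum slope is an illustrative value:</p>

```python
def filter_by_slope(segments, min_abs_slope=0.3):
    """Drop near-horizontal segments and split the rest into left/right lane
    candidates by slope sign (image y grows downwards, so the left lane
    line has negative slope)."""
    left, right = [], []
    for x1, y1, x2, y2 in segments:
        if x2 == x1:
            continue                      # vertical segment: skipped for simplicity
        slope = (y2 - y1) / (x2 - x1)
        if abs(slope) < min_abs_slope:
            continue                      # near-horizontal: not a lane line
        (left if slope < 0 else right).append((x1, y1, x2, y2))
    return left, right

segs = [(10, 90, 40, 30), (60, 30, 90, 90), (10, 50, 90, 52)]
left, right = filter_by_slope(segs)       # the last, flat segment is discarded
```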
<h3 id="compute-lines">Compute lines</h3>
<p>Now I need to average/extrapolate the result of the Hough transform and draw the two lines onto the image. I used the function <a href="http://docs.opencv.org/2.4/modules/imgproc/doc/structural_analysis_and_shape_descriptors.html#fitline"><code class="language-plaintext highlighter-rouge">fitLine</code></a>, after having extracted the points from the Hough transform result with the OpenCV function <code class="language-plaintext highlighter-rouge">findNonZero</code>. I did this twice, once for the right line and once for the left line. As a result, I got the slopes of the two lines and could draw them onto the original picture:</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/final.png" width="360" alt="Combined Image" /></p>
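<p>The averaging/extrapolation step can be sketched with a least-squares fit. Here <code>np.polyfit</code> stands in for OpenCV's <code>fitLine</code>; fitting <code>x</code> as a function of <code>y</code> keeps near-vertical lane lines well conditioned:</p>

```python
import numpy as np

def fit_and_extrapolate(points, y_bottom, y_top):
    """Fit x = m*y + c through the edge points of one lane line and return
    the segment end points at two chosen image rows."""
    xs = np.array([p[0] for p in points], dtype=float)
    ys = np.array([p[1] for p in points], dtype=float)
    m, c = np.polyfit(ys, xs, 1)
    return ((int(round(m * y_bottom + c)), y_bottom),
            (int(round(m * y_top + c)), y_top))

pts = [(10, 90), (20, 70), (30, 50)]      # points from one side of the lane
p1, p2 = fit_and_extrapolate(pts, y_bottom=100, y_top=40)
```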
<h1 id="results">Results:</h1>
<h2 id="pictures">Pictures</h2>
<p>Here are some results on test images provided by Udacity: <br />
<img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/final.png" width="360" alt="Combined Image" /> <img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/result1.png" width="360" alt="Combined Image" /> <br />
<img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/result2.png" width="360" alt="Combined Image" /> <img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/result3.png" width="360" alt="Combined Image" /> <br />
<img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/result4.png" width="360" alt="Combined Image" /> <img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/result5.png" width="360" alt="Combined Image" /></p>
<p>You can find the original pictures and the results in the folder <code class="language-plaintext highlighter-rouge">test_images</code>.</p>
<h2 id="videos">Videos</h2>
<p>Here are some results on test videos provided by Udacity:</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/white.gif" alt="" />
<img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/yellow.gif" alt="" /></p>
<p>You can find the video files here: <a href="https://github.com/jokla/CarND-LaneLines-P1/blob/master/yellow.mp4">video1</a>, <a href="https://github.com/jokla/CarND-LaneLines-P1/blob/master/white.mp4">video2</a>.</p>
<h3 id="optional-challenge">Optional challenge:</h3>
<p>While I got a satisfactory result on the first two videos provided by Udacity, this was not the case for the challenge video, which presents several additional difficulties:</p>
<ul>
<li>The color of the asphalt became lighter at a certain point. The Canny edge detector is not able to find the line using the grayscale image (where we lose information about the color)</li>
<li>The car is driving on a curving road</li>
<li>There are some shadows due to some trees</li>
</ul>
<p>To overcome these problems, I introduced the color mask and resized the ROI. This is the result, using only the color mask (without the Canny detection):</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/extra.gif" alt="" /></p>
<p>You can find the video file here: <a href="https://github.com/jokla/CarND-LaneLines-P1/blob/master/extra.mp4">video_challenge</a></p>
<p>The right line is a little jumpy, mainly because of the curve: the function <code class="language-plaintext highlighter-rouge">fitLine</code> is trying to fit a straight line on a curving lane. It would be useful to shrink the ROI in this case, but I preferred to keep the same ROI size used in the first two videos.</p>
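<p>One way to reduce this jumpiness would be to smooth the per-frame estimates over time. A minimal sketch with an exponential moving average (the weight <code>alpha</code> is an illustrative choice; the project does not implement this):</p>

```python
def smooth(estimates, alpha=0.2):
    """Exponential moving average over per-frame slope estimates: each new
    frame moves the running value only part of the way, so one noisy
    detection cannot make the drawn line jump."""
    out, current = [], None
    for e in estimates:
        current = e if current is None else alpha * e + (1 - alpha) * current
        out.append(current)
    return out

slopes = [0.50, 0.51, 0.49, 0.80, 0.50]   # one outlier frame
smoothed = smooth(slopes)                 # the 0.80 spike is strongly damped
```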
<p>If we analyze the steps using a snapshot from the challenge video, we can notice that the Canny detector is not very useful:</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/original_challenge.png" width="360" alt="Combined Image" /> <img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/canny_challenge.png" width="360" alt="Combined Image" /></p>
<p>while the color mask is able to detect the lines:</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/color_challenge.png" width="360" alt="Combined Image" /></p>
<p>Indeed, as you can see in the following picture, we lose valuable color information when we convert the image to grayscale. Moreover, the Canny operator finds a lot of edges when there are shadows on the road.</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/gray_challenge.png" width="360" alt="Combined Image" /></p>
<h3 id="testing-the-pipeline-on-a-youtube-video">Testing the pipeline on a YouTube Video:</h3>
<p>Just out of curiosity, I wanted to test the pipeline on a video extracted from YouTube (see the original video <a href="https://www.youtube.com/watch?v=jwBaGY67olI">here</a>).</p>
<p>I noticed that the color selection was not working properly in this case, so I had to tune the threshold values a little. This is the new result, using both color selection and Canny:</p>
<p><img src="https://raw.githubusercontent.com/jokla/jokla.github.io/master/images/post/2017-03-10-lane-detection/extra_test.gif" alt="" /></p>
<p>You can find the video file here: <a href="https://github.com/jokla/CarND-LaneLines-P1/blob/master/extra_test_result.mp4">video_extra</a></p>
<p>It would be wiser to convert the image to the HSV color space and apply the color selection there, instead of on the RGB image.</p>
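<p>The point of HSV can be illustrated with the standard library's <code>colorsys</code>: the same yellow paint in bright sun and in shadow has very different RGB values but almost the same hue, so a single hue threshold covers both cases (the RGB triples below are made-up examples):</p>

```python
import colorsys

bright = (230, 210, 40)    # hypothetical yellow line in full sun
shadow = (115, 105, 20)    # the same paint, half as bright

h1, s1, v1 = colorsys.rgb_to_hsv(*[c / 255 for c in bright])
h2, s2, v2 = colorsys.rgb_to_hsv(*[c / 255 for c in shadow])

# The value (brightness) channel halves, but the hue barely moves,
# which is why thresholding on hue is more robust than on raw RGB.
```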
<h2 id="2-potential-shortcomings-with-the-current-pipeline">2. Potential shortcomings with the current pipeline</h2>
<ul>
<li>This approach may not work properly:
<ul>
<li>if the camera is placed at a different position</li>
<li>if other vehicles in front are occluding the view</li>
<li>if one or more lines are missing</li>
<li>in different weather and lighting conditions (fog, rain, or at night)</li>
</ul>
</li>
</ul>
<h2 id="3-possible-improvements">3. Possible improvements</h2>
<p>Some possible improvements:</p>
<ul>
<li>Perform the color selection in the HSV space, instead of in RGB</li>
<li>Update the ROI mask dynamically</li>
<li>Perform a segmentation of the road</li>
<li>Use a better filter to smooth the current estimate using the previous ones</li>
<li>If a line is not detected, we could estimate the current slope using the previous estimates and/or the other line’s detection</li>
<li>Use a moving-edges tracker for the continuous lines</li>
</ul>Giovanni ClaudioFirst project of the Udacity Self Driving Car NanoDegreeInstall Unity3D in Ubuntu2017-02-05T00:00:00+00:002017-02-05T00:00:00+00:00http://jokla.me/self-driving-car/install-unity3d-ubuntu<p>I wanted to test on my laptop this nice project created with Unity3D by <a href="https://github.com/tawnkramer">tawnkramer</a>:</p>
<iframe width="640" height="360" src="https://www.youtube.com/embed/e0AFMilaeMI" frameborder="0" allowfullscreen=""></iframe>
<p>Unity3D is used to simulate a self-driving car. A neural network based on the Nvidia’s paper <a href="https://images.nvidia.com/content/tegra/automotive/images/2016/solutions/pdf/end-to-end-dl-using-px.pdf">“End to End Learning for Self-Driving Cars”</a> is trained to drive the car down a randomly generated road. You can find the source code <a href="https://github.com/tawnkramer/sdsandbox">here</a>.</p>
<h2 id="download-unity-for-linux">Download Unity for Linux</h2>
<p>Download the latest version <a href="https://forum.unity3d.com/threads/unity-on-linux-release-notes-and-known-issues.350256/">here</a>. I downloaded the Debian package: <a href="http://beta.unity3d.com/download/35e1927e3b6b/public_download.html">Unity 5.6.0xb3Linux</a>.</p>
<h2 id="installation">Installation</h2>
<p>You can install the .deb package via the Ubuntu Software Center; it is expected to work on Ubuntu 12.04 or newer.</p>
<h2 id="fix-black-screen-launching-the-application">Fix grey screen when launching the application</h2>
<p>Initially, Unity3D did not start properly; I was only getting a grey screen. <a href="https://forum.unity3d.com/threads/dark-grey-screen-fix.448936/">This solution</a>, posted by malyzeli, solved the problem.</p>Giovanni ClaudioTested in Ubuntu 14.04How to compute frame transformations with Tf (ROS)2016-08-08T00:00:00+00:002016-08-08T00:00:00+00:00http://jokla.me/robotics/lookup-transform-ros<p>Here is a script to compute the transformation between two frames published in /tf in ROS.<br />
You can <a href="https://gist.github.com/jokla/2630f8dbbd4391d91ab28b2f3f76801a/raw/fbf02a7ec13c4d7f61e9aea86719f866f18f539b/computeTF2frames.py">download</a> the python script, modify the name of the topics and run it with<br />
<code class="language-plaintext highlighter-rouge">python computeTF2frames.py</code></p>
<p>The result:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Translation: (0.06290424180516997, -0.004345408291908189, 1.1515618071559173)
Rotation: (-0.5041371308002005, 0.5088056423602352, -0.49116582568647843, 0.4957002151791443)
</code></pre></div></div>
<script src="https://gist.github.com/2630f8dbbd4391d91ab28b2f3f76801a.js"> </script>Giovanni Claudiousing the function lookupTransformIntrinsic camera calibration for Nao/Romeo/Pepper with Visp2016-08-04T00:00:00+00:002016-08-04T00:00:00+00:00http://jokla.me/robotics/camera-calibration-visp<p>We will show here how to estimate the <a href="http://ksimek.github.io/2013/08/13/intrinsic/">camera intrinsic parameters</a> for the robot Nao, Romeo or Pepper, using the <a href="http://visp-doc.inria.fr/doxygen/visp-2.8.0/tutorial-calibration.html">ViSP camera calibration tool</a>.</p>
<p>First of all, we need ViSP, <code class="language-plaintext highlighter-rouge">visp_naoqi</code> and the C++ SDK from SoftBank. You can follow <a href="http://jokla.me/robotics/visp_naoqi/">this guide</a>.</p>
<p>Once everything is working we can run the program to estimate the parameters:</p>
<ul>
<li>Go to the build folder of <code class="language-plaintext highlighter-rouge">visp_naoqi</code> via terminal</li>
<li>
<p>Run the program <code class="language-plaintext highlighter-rouge">camera_calibration</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Usage: ./sdk/bin/camera_calibration [ --config <configuration file>.cfg] [--ip <robot address>] [--port <port robot>] [--cam camera_number] [--name camera_name] [--vga] [--help]
</code></pre></div> </div>
<p>Here the explanation of the options:</p>
<ul>
<li>[ <code class="language-plaintext highlighter-rouge">--config <configuration file>.cfg</code>] The path to a configuration file where we define the kind of pattern we are using (size of the grid and dimension of the circles/squares). You can find two examples here: <a href="https://github.com/lagadic/visp_naoqi/blob/master/tools/calibration/default-chessboard.cfg">default-chessboard.cfg</a> or <a href="https://github.com/lagadic/visp_naoqi/blob/master/tools/calibration/default-circles.cfg">default-circles.cfg</a></li>
<li>[<code class="language-plaintext highlighter-rouge">--ip <robot address></code>] Set the IP address of the robot.</li>
<li>[<code class="language-plaintext highlighter-rouge">--port <port robot></code>] Set the port of the robot (default: 9559).</li>
<li>[<code class="language-plaintext highlighter-rouge">--cam camera_number</code>] Choose the camera you want to use. For Pepper and Nao 0 = TopCamera, 1 = BottomCamera.</li>
<li>[<code class="language-plaintext highlighter-rouge">--name camera_name</code>] Set the name of the camera.</li>
<li>[<code class="language-plaintext highlighter-rouge">--vga</code>] Set the camera resolution to 640x480. Default resolution: 320x240<br />
Example:</li>
</ul>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ./sdk/bin/camera_calibration --config /udd/gclaudio/romeo/cpp/workspace/visp_naoqi/tools/calibration/default-circles.cfg --cam 0 --name cameraTopPepper --ip 131.254.10.126
</code></pre></div> </div>
</li>
</ul>Giovanni ClaudioIntrinsic camera calibration for Nao/Romeo/Pepper with Visp.Speech recognition using ROS and Pocketsphinx2016-03-28T00:00:00+00:002016-03-28T00:00:00+00:00http://jokla.me/robotics/speech-recognition-ros<p>How to install pocketsphinx on ROS Indigo (Ubuntu 14.04).</p>
<aside class="sidebar__right">
<nav class="toc">
<header><h4 class="nav__title"><i class="fa fa-file-text"></i> table of contents</h4></header>
<ul class="toc__menu" id="markdown-toc">
<li><a href="#install-dependecies" id="markdown-toc-install-dependecies">Install Dependencies</a></li>
<li><a href="#clone-and-build-pocketsphinx" id="markdown-toc-clone-and-build-pocketsphinx">Clone and Build Pocketsphinx</a></li>
<li><a href="#test-pocketsphinx" id="markdown-toc-test-pocketsphinx">Test Pocketsphinx</a></li>
<li><a href="#command-turtlebot-robot-in-gazebo-using-voice-commands" id="markdown-toc-command-turtlebot-robot-in-gazebo-using-voice-commands">Command Turtlebot robot in Gazebo using voice commands</a></li>
</ul>
</nav>
</aside>
<h2 id="install-dependecies">Install Dependencies</h2>
<ul>
<li><code class="language-plaintext highlighter-rouge">$ sudo apt-get install gstreamer0.10-pocketsphinx</code></li>
<li><code class="language-plaintext highlighter-rouge">$ sudo apt-get install python-gst0.10</code></li>
<li><code class="language-plaintext highlighter-rouge">$ sudo apt-get install gstreamer0.10-gconf</code></li>
</ul>
<h2 id="clone-and-build-pocketsphinx">Clone and Build Pocketsphinx</h2>
<ul>
<li>Clone the repository in your catkin src folder:
<ul>
<li><code class="language-plaintext highlighter-rouge">$ git clone https://github.com/mikeferguson/pocketsphinx.git</code></li>
</ul>
</li>
<li>Launch the catkin_make command from the catkin workspace folder</li>
</ul>
<h2 id="test-pocketsphinx">Test Pocketsphinx</h2>
<ul>
<li>Now we can start the speech recognizer:
<ul>
<li><code class="language-plaintext highlighter-rouge">$ roslaunch pocketsphinx turtlebot_voice_cmd.launch</code></li>
</ul>
</li>
<li>These are the basic commands that can be recognized (see the file voice_cmd.corpus in the demo folder):</li>
</ul>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>forward
left
right
back
backward
stop
move forward
move right
move left
move back
move backward
halt
half speed
full speed
</code></pre></div></div>
<ul>
<li>Open another terminal to check if the words are recognized:</li>
<li><code class="language-plaintext highlighter-rouge">$ rostopic echo /recognizer/output</code></li>
</ul>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>jokla@Dell-PC:~/catkin_ws/src$ rostopic echo /recognizer/output
data: back
---
data: speed
---
data: move right
---
data: move back
---
data: back
---
data: left
---
data: speed
---
data: left
---
data: move right
---
data: stop
---
data: move right
---
data: move left
---
data: full speed
</code></pre></div></div>
<ul>
<li>Each voice command corresponds to a twist command. For example, this is the twist corresponding to the command “back”:</li>
</ul>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>linear:
x: <span class="nt">-0</span>.4
y: 0.0
z: 0.0
angular:
x: 0.0
y: 0.0
  z: 0.0
</code></pre></div></div>
<h2 id="command-turtlebot-robot-in-gazebo-using-voice-commands">Command Turtlebot robot in Gazebo using voice commands</h2>
<ul>
<li>Install <a href="http://wiki.ros.org/turtlebot_simulator">turtlebot_simulator</a>
<ul>
<li><code class="language-plaintext highlighter-rouge">$ sudo apt-get install ros-indigo-turtlebot-simulator</code></li>
</ul>
</li>
<li>Open the launch file <code class="language-plaintext highlighter-rouge">turtlebot_voice_cmd</code> and remap the name of the topic cmd_vel to <code class="language-plaintext highlighter-rouge">/mobile_base/commands/velocity</code></li>
</ul>
<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt"><node</span> <span class="na">name=</span><span class="s">"voice_cmd_vel"</span> <span class="na">pkg=</span><span class="s">"pocketsphinx"</span> <span class="na">type=</span><span class="s">"voice_cmd_vel.py"</span> <span class="na">output=</span><span class="s">"screen"</span><span class="nt">></span>
<span class="nt"><remap</span> <span class="na">from=</span><span class="s">"cmd_vel"</span> <span class="na">to=</span><span class="s">"/mobile_base/commands/velocity"</span><span class="nt">/></span>
<span class="nt"></node></span>
</code></pre></div></div>
<p>In this way the node pocketsphinx publishes the velocities in the topic <code class="language-plaintext highlighter-rouge">/mobile_base/commands/velocity</code> and gazebo subscribes to it.</p>
<ul>
<li>Launch simulation (click <a href="http://wiki.ros.org/turtlebot_gazebo/Tutorials/indigo/Gazebo%20Bringup%20Guide">here</a> for more info). Note that Gazebo may update its model database when it is started for the first time. This may take a few minutes.
<ul>
<li><code class="language-plaintext highlighter-rouge">$ roslaunch turtlebot_gazebo turtlebot_world.launch</code></li>
</ul>
</li>
<li>Now we can start the speech recognizer:
<ul>
<li><code class="language-plaintext highlighter-rouge">$ roslaunch pocketsphinx turtlebot_voice_cmd.launch</code></li>
</ul>
</li>
<li>You should be able to control Turtlebot using your voice.</li>
</ul>Giovanni ClaudioHow to install pocketsphinx on ROS Indigo (Ubuntu 14.04).Places to see in Brittany2014-12-29T00:00:00+00:002014-12-29T00:00:00+00:00http://jokla.me/travel/brittany-to-see<p>I have the fortune to live in one of the best regions of France: Brittany. During my weekend I love traveling around my current city, Rennes. Having said this, I want to share a map collecting the places that I want to see or I have already seen. I will try to improve this map adding comments and pictures.</p>
<iframe src="https://mapsengine.google.com/map/embed?mid=z_K4lDOSEk7c.kzlVNHXYev4g" width="640" height="480"></iframe>Giovanni ClaudioSome tips for holidays in Brittany (with map)Naoqi C++ SDK Installation2014-12-28T00:00:00+00:002014-12-28T00:00:00+00:00http://jokla.me/robotics/install-sdk-c-naoqi<aside class="sidebar__right">
<nav class="toc">
<header><h4 class="nav__title"><i class="fa fa-file-text"></i> table of contents</h4></header>
<ul class="toc__menu" id="markdown-toc">
<li><a href="#prerequisites" id="markdown-toc-prerequisites">Prerequisites</a> <ul>
<li><a href="#installation-ide" id="markdown-toc-installation-ide">Installation IDE</a></li>
<li><a href="#download-software" id="markdown-toc-download-software">Download software</a> <ul>
<li><a href="#c-sdk-and-cross-toolchain" id="markdown-toc-c-sdk-and-cross-toolchain">C++ SDK and Cross Toolchain</a></li>
</ul>
</li>
<li><a href="#creation-devtools-and-workspace-folders" id="markdown-toc-creation-devtools-and-workspace-folders">Creation Devtools and workspace folders</a> <ul>
<li><a href="#qibuild" id="markdown-toc-qibuild">Qibuild</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#using-qibuild-with-aldebaran-c-sdks" id="markdown-toc-using-qibuild-with-aldebaran-c-sdks">Using qibuild with Aldebaran C++ SDKs</a> <ul>
<li><a href="#optional-test" id="markdown-toc-optional-test">Optional Test:</a></li>
</ul>
</li>
</ul>
</nav>
</aside>
<h2 id="prerequisites">Prerequisites</h2>
<h3 id="installation-ide">Installation IDE</h3>
<p>QT Creator is the IDE recommended by SoftBank Robotics.</p>
<ul>
<li>Download the installer available <a href="http://qt-project.org/downloads#qt-creator">here</a>. In my case the file is named <code class="language-plaintext highlighter-rouge">qt-opensource-linux-x64-1.6.0-5-online.run</code>.</li>
<li>Go in the folder where you downloaded the installer of qt-creator and give execute permission with:<br />
<code class="language-plaintext highlighter-rouge">$ chmod a+x qt-opensource-linux-x64-1.6.0-5-online.run</code></li>
<li>Run the installer:<br />
<code class="language-plaintext highlighter-rouge">$ ./qt-opensource-linux-x64-1.6.0-5-online.run</code></li>
</ul>
<h3 id="download-software">Download software</h3>
<h4 id="c-sdk-and-cross-toolchain">C++ SDK and Cross Toolchain</h4>
<p>Download the following packages <a href="https://community.aldebaran-robotics.com/resources/">here</a> or <a href="https://developer.softbankrobotics.com/us-en/downloads/pepper">here</a> for Pepper:</p>
<ul>
<li>C++ SDK 2.3 Linux 64 (or newer version)</li>
</ul>
<h3 id="creation-devtools-and-workspace-folders">Creation Devtools and workspace folders</h3>
<ul>
<li>Let’s create now some folders useful for the development with the SDK:<br />
<code class="language-plaintext highlighter-rouge">$ mkdir -p ~/romeo/{devtools,workspace} </code></li>
</ul>
<p>NB: This is just a suggestion, you can manage these folders as you prefer.</p>
<ul>
<li>Now we can extract the C++ SDK and Cross Toolchain in the devtools folder. Go via terminal in the folder where you downloaded the tools and run:<br />
<code class="language-plaintext highlighter-rouge">$ tar -zxvf naoqi-sdk-2.3.0.14-linux64.tar.gz -C ~/romeo/devtools/</code></li>
</ul>
<h4 id="qibuild">Qibuild</h4>
<ul>
<li>Open a terminal and install Qibuild with <a href="https://pip.pypa.io/en/latest/installing.html#install-pip">pip</a>:
<code class="language-plaintext highlighter-rouge">$ pip install qibuild</code></li>
<li>If you don’t have pip installed you can install it with:
<code class="language-plaintext highlighter-rouge">$ sudo apt-get install python-pip</code></li>
<li>Now we add the installation location of Qibuild to the PATH. Open the bashrc file: <code class="language-plaintext highlighter-rouge">$ gedit ~/.bashrc</code> and at the end of the file add:<br />
<code class="language-plaintext highlighter-rouge">export PATH=${PATH}:${HOME}/.local/bin</code></li>
<li>Open a new terminal and check if Qibuild is correctly installed:<br />
<code class="language-plaintext highlighter-rouge">$ qibuild --version</code></li>
<li>Now we have to create a qibuild “worktree”. This path will be the root from which qibuild searches for the sources of your projects. We can use the folder we created before: <code class="language-plaintext highlighter-rouge">~/romeo/workspace</code>.<br />
<code class="language-plaintext highlighter-rouge">$ cd ~/romeo/workspace</code>
and type:<br />
<code class="language-plaintext highlighter-rouge">$ qibuild init</code></li>
<li>Now we can run:
<code class="language-plaintext highlighter-rouge">$ qibuild config --wizard</code> <br />
A file will be generated in ~/.config/qi/qibuild.xml; it is shared by all the worktrees you will create. You will be asked to choose a CMake generator (select Unix Makefiles) and an IDE (choose QtCreator, or another one if you use a different IDE).</li>
</ul>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>:: Please choose a generator:
> 1 (Unix Makefiles)
:: Please choose an IDE
> 2 (QtCreator)
:: Do you want to use qtcreator from /usr/bin/qtcreator?
> Y (Yes)
:: Found a worktree in /udd/fspindle/soft/romeo/workspace_gantry
:: Do you want to configure settings for this worktree? (y/N)
> y
:: Do you want to use a unique build dir? (mandatory when using Eclipse) (y/N)
> N
</code></pre></div></div>
<ul>
<li>If you see a message like “CMake not found”, you probably have to install CMake:</li>
<li><code class="language-plaintext highlighter-rouge">sudo apt-get update && sudo apt-get install cmake </code></li>
<li>We can create, configure and build a new project called “foo”:
<code class="language-plaintext highlighter-rouge">$ qisrc create foo</code> <br />
New project initialized in /home/jokla/romeo/workspace/foo
<code class="language-plaintext highlighter-rouge">$ qibuild configure foo</code></li>
</ul>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Current build worktree: /home/jokla/romeo/workspace
Build type: Debug
* (1/1) Configuring foo
-- The C compiler identification is GNU 4.8.2
-- The CXX compiler identification is GNU 4.8.2
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Using qibuild 3.6.2
-- Binary: foo
-- Binary: test_foo
-- Configuring done
-- Generating done
-- Build files have been written to: /home/jokla/romeo/workspace/foo/build-sys-linux-x86_64
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">$ qibuild make foo</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Current build worktree: /home/jokla/romeo/workspace
Build type: Debug
* (1/1) Building foo
Scanning dependencies of target foo
[ 50%] Building CXX object CMakeFiles/foo.dir/main.cpp.o
Linking CXX executable sdk/bin/foo
[ 50%] Built target foo
Scanning dependencies of target test_foo
[100%] Building CXX object CMakeFiles/test_foo.dir/test.cpp.o
Linking CXX executable sdk/bin/test_foo
[100%] Built target test_foo
</code></pre></div></div>
<ul>
<li>We can run the executable of the project “foo”:<br />
<code class="language-plaintext highlighter-rouge">$ ~/romeo/workspace/foo/build-sys-linux-x86_64/sdk/bin/foo</code><br />
You should see:<br />
<code class="language-plaintext highlighter-rouge">Hello, world</code></li>
</ul>
<p>References: <a href="https://community.aldebaran-robotics.com/doc/1-14/dev/cpp/tutos/using_qibuild.html#cpp-tutos-using-qibuild">link1</a>, <a href="https://community.aldebaran-robotics.com/doc/qibuild/beginner/qibuild/aldebaran.html">link2</a></p>
<h2 id="using-qibuild-with-aldebaran-c-sdks">Using qibuild with Aldebaran C++ SDKs</h2>
<ul>
<li>Now we need to create a toolchain (change with your path to the file toolchain.xml you want to use. You will find it in the naoqi-sdk folder):</li>
</ul>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ qitoolchain create toolchain_romeo /home/jokla/romeo/devtools/naoqi-sdk-2.3.0.14-linux64/toolchain.xml --default
</code></pre></div></div>
<p>NB: Instead of <code class="language-plaintext highlighter-rouge">toolchain_romeo</code> you can choose any name you want. You can also create several different toolchains.</p>
<ul>
<li>If you have a new version of qibuild the procedure is slightly different:</li>
</ul>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ qitoolchain create toolchain_romeo /local/soft/naoqi-sdk/naoqi-sdk-2.3.0.14-linux64/toolchain.xml
$ qibuild add-config toolchain_romeo -t toolchain_romeo --default
</code></pre></div></div>
<h3 id="optional-test">Optional Test:</h3>
<ul>
<li>Open a terminal and type:</li>
</ul>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cd ~/romeo/devtools/naoqi-sdk-2.3.0.14-linux64/doc/dev/cpp/examples
$ qibuild init --interactive
</code></pre></div></div>
<ul>
<li>Now we can configure and build the examples:</li>
</ul>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cd core/helloworld/
$ qibuild configure -c toolchain_romeo
$ qibuild make -c toolchain_romeo
</code></pre></div></div>
<ul>
<li>You can also build in release mode:</li>
</ul>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ qibuild configure --release <project_name>
$ qibuild make --release <project_name>
</code></pre></div></div>Giovanni ClaudioHow to install the SDK C++ for Pepper, Romeo or Nao.Matlab ROS Bridge2014-12-05T00:00:00+00:002014-12-05T00:00:00+00:00http://jokla.me/software/matlab_ros_bridge<p>This software consists of a set of Matlab C++ S-functions that can be used to:</p>
<ul>
<li>synchronize Simulink with the system clock, thus obtaining a soft real-time execution;</li>
<li>interface Simulink blocks with other ROS nodes using ROS messages.</li>
</ul>
<p>This project is based on a work started by Martin Riedel and Riccardo Spica at the Max Planck Institute for Biological Cybernetics in Tuebingen (Germany).
This fork is currently supported by <a href="mailto:riccardo.spica@irisa.fr">Riccardo Spica</a> and <a href="mailto:giovanni.claudio@irisa.fr">Giovanni Claudio</a> at Inria in Rennes (France).</p>
<p>The software is released under the BSD license. See the LICENSE file in this repository for more information.</p>Giovanni ClaudioA set of Matlab C++ S-functions that can be used to interface Simulink blocks with ROS.Romeo and ROS2014-10-29T00:00:00+00:002014-10-29T00:00:00+00:00http://jokla.me/robotics/ros-romeo<aside class="sidebar__right">
<nav class="toc">
<header><h4 class="nav__title"><i class="fa fa-file-text"></i> table of contents</h4></header>
<ul class="toc__menu" id="markdown-toc">
<li><a href="#list-of-packages" id="markdown-toc-list-of-packages">List of packages:</a> <ul>
<li><a href="#step-1-controller-romeo-and--joint-state-publisher" id="markdown-toc-step-1-controller-romeo-and--joint-state-publisher">Step 1: Controller Romeo and joint state publisher:</a></li>
<li><a href="#step-2-use-moveit" id="markdown-toc-step-2-use-moveit">Step 2: Use Moveit</a></li>
<li><a href="#calibration-camera" id="markdown-toc-calibration-camera">Calibration camera</a></li>
</ul>
</li>
</ul>
</nav>
</aside>
<h1 id="list-of-packages">List of packages:</h1>
<ul>
<li><a href="https://github.com/ros-aldebaran/romeo_robot">ros-aldebaran/romeo_robot</a></li>
<li><a href="https://github.com/ros-nao/nao_sensors">ros-nao/nao_sensors</a></li>
<li><a href="https://github.com/ros-aldebaran/romeo_moveit_config">ros-aldebaran/romeo_moveit_config</a></li>
</ul>
<h2 id="step-1-controller-romeo-and--joint-state-publisher">Step 1: Romeo controller and joint state publisher</h2>
<ul>
<li>Install the package <a href="http://wiki.ros.org/ros_control">ros_control</a>:
<code class="language-plaintext highlighter-rouge">sudo apt-get install ros-indigo-ros-control ros-indigo-ros-controllers</code></li>
<li>Set AL_DIR to the path of the NAOqi C++ SDK on your computer:
<ul>
<li><code class="language-plaintext highlighter-rouge">gedit ~/.bashrc</code></li>
<li>Add the following line setting the correct path to your naoqi-sdk-c++:
<code class="language-plaintext highlighter-rouge">export AL_DIR=/local/soft/naoqi/naoqi-sdk-2.1.0.19-linux64</code></li>
</ul>
</li>
<li>Clone the repository <a href="https://github.com/ros-aldebaran/romeo_robot">ros-aldebaran/romeo_robot</a> in your <code class="language-plaintext highlighter-rouge">catkin_ws</code></li>
<li>In a terminal, go to your <code class="language-plaintext highlighter-rouge">catkin_ws</code> and build the packages with <code class="language-plaintext highlighter-rouge">catkin_make</code></li>
<li>Source your workspace:
<code class="language-plaintext highlighter-rouge">source devel/setup.bash</code></li>
<li>Now you can run:
<code class="language-plaintext highlighter-rouge">roslaunch romeo_dcm_bringup romeo_dcm_bringup_remote.launch</code></li>
</ul>
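<p>The Step 1 instructions above can be condensed into a single shell session. This is a sketch, not a verified script: it assumes ROS Indigo, a catkin workspace at <code class="language-plaintext highlighter-rouge">~/catkin_ws</code>, and the example SDK path from this post; adjust all paths to your own install.</p>

```shell
# Install ros_control and the standard controllers (ROS Indigo)
sudo apt-get install ros-indigo-ros-control ros-indigo-ros-controllers

# Point AL_DIR at the NAOqi C++ SDK (example path; adjust to your install)
echo 'export AL_DIR=/local/soft/naoqi/naoqi-sdk-2.1.0.19-linux64' >> ~/.bashrc
source ~/.bashrc

# Clone, build, and source the workspace, then start the DCM bringup
cd ~/catkin_ws
git clone https://github.com/ros-aldebaran/romeo_robot src/romeo_robot
catkin_make
source devel/setup.bash
roslaunch romeo_dcm_bringup romeo_dcm_bringup_remote.launch
```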
<h2 id="step-2-use-moveit">Step 2: Use Moveit</h2>
<ul>
<li>Complete Step 1 first</li>
<li>Install Moveit in Indigo:
<ul>
<li>Open the Synaptic package manager and search for <code class="language-plaintext highlighter-rouge">moveit</code></li>
<li>Install the packages you find (I don’t know exactly which packages are required, so I installed all of them except those containing <code class="language-plaintext highlighter-rouge">config</code>)</li>
</ul>
</li>
<li>Clone the repository <a href="https://github.com/ros-aldebaran/romeo_moveit_config">ros-aldebaran/romeo_moveit_config</a> into your <code class="language-plaintext highlighter-rouge">catkin_ws</code></li>
<li>In a terminal, go to your <code class="language-plaintext highlighter-rouge">catkin_ws</code> and build the packages with <code class="language-plaintext highlighter-rouge">catkin_make</code></li>
<li>Source your workspace:
<code class="language-plaintext highlighter-rouge">source devel/setup.bash</code></li>
<li>Run the <code class="language-plaintext highlighter-rouge">dcm_bringup</code> launch file:
<code class="language-plaintext highlighter-rouge">roslaunch romeo_dcm_bringup romeo_dcm_bringup_remote.launch</code></li>
<li>Wait until <code class="language-plaintext highlighter-rouge">romeo_dcm_bringup</code> node is ready, then run MoveIt:
<code class="language-plaintext highlighter-rouge">roslaunch romeo_moveit_config moveit_planner_romeo.launch</code></li>
</ul>
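<p>Put together, the Step 2 session looks roughly like this. Again a sketch under the same assumptions as Step 1 (ROS Indigo, workspace at <code class="language-plaintext highlighter-rouge">~/catkin_ws</code>, MoveIt already installed via Synaptic); the two launch files must run in separate terminals, each with the workspace sourced.</p>

```shell
# Clone, build, and source the MoveIt config for Romeo
cd ~/catkin_ws
git clone https://github.com/ros-aldebaran/romeo_moveit_config src/romeo_moveit_config
catkin_make
source devel/setup.bash

# Terminal 1: start the DCM bringup and wait until it is ready
roslaunch romeo_dcm_bringup romeo_dcm_bringup_remote.launch

# Terminal 2 (source devel/setup.bash there too): start the MoveIt planner
roslaunch romeo_moveit_config moveit_planner_romeo.launch
```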
<h2 id="calibration-camera">Calibration camera</h2>
<p>Calibrate the camera:
<code class="language-plaintext highlighter-rouge">rosrun camera_calibration cameracalibrator.py --size 9x6 --square 0.025 --no-service-check image:=/nao_camera/image_raw camera:=/nao_camera</code></p>
<p>See the <a href="http://wiki.ros.org/camera_calibration/Tutorials/MonocularCalibration">monocular calibration tutorial</a> for details.</p>
<p>To convert the calibration file from INI to YAML:</p>
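<p>For example, with the <code class="language-plaintext highlighter-rouge">convert</code> tool shipped with camera_calibration_parsers (file names here are placeholders, not from this post):</p>

```shell
# Convert an INI calibration file produced by cameracalibrator.py to YAML
rosrun camera_calibration_parsers convert camera.ini camera.yaml
```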
<p>See <a href="http://wiki.ros.org/camera_calibration_parsers">camera_calibration_parsers</a>.</p>Giovanni ClaudioHow to control Romeo via ROS.