<?xml version="1.0" encoding="iso-8859-1" ?>
<rss version="2.0" 
   xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule" 
   xmlns:html="http://www.w3.org/1999/html" 
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
   xmlns:slash="http://purl.org/rss/1.0/modules/slash/">
<channel>
   <title>pmeerw's blog</title>
   <link>http://pmeerw.dyndns.org/blog</link>
   <description>total garbage comes to mind</description>
   <language>en</language>
   <copyright>Copyright 2007-2011 Peter Meerwald</copyright>
   <ttl>60</ttl>
   <pubDate>Sat, 02 Feb 2013 12:39 GMT</pubDate>
   <managingEditor>pmeerw@pmeerw.net</managingEditor>
   <generator>PyBlosxom http://pyblosxom.sourceforge.net/ 1.4.3 01/10/2008</generator>
<item>
   <title>D0x3d: The Game</title>
   <guid isPermaLink="false">fun/d0x3d</guid>
   <link>http://pmeerw.dyndns.org/blog/fun/d0x3d.html</link>
   <description><![CDATA[

<p><a href="https://github.com/TableTopSecurity/d0x3d-the-game">d0x3d</a> is a cooperative board game in which hackers try to steal information from a network. D0x3d is hacker slang for publishing information, usually of the incriminating kind.</p>

<p>At the monthly <a href="http://sbg.chaostreff.at">Salzburger Chaostreff</a> on February 1 we assembled the freely available paper cards (<a href="http://creativecommons.org/licenses/by-nc-sa/3.0/">Creative Commons BY-NC-SA-3.0</a> license) with plenty of effort, scissors and glue -- and of course play-tested the game.</p>

<img src="/blog/images/d0x3d_3.jpg" />

<p>The goal of the game is to collect ('lo0t') four different kinds of information from the machines of a network and to slip out through the Internet gateway undetected. Up to four hackers with specific skills cooperate: there is, for example, the cryptographer, the war driver, the social engineer, etc. After each turn of the hackers, the network's admins try to patch the compromised servers and grow ever more vigilant. Only by spending valuable 'zero-days' can the attackers conceal their presence... And what does 'compromised' actually mean?</p>

<img src="/blog/images/d0x3d_1.jpg" />&nbsp;<br>
<!-- <img src="/blog/images/d0x3d_2.jpg" /> -->

<p>Changing the network topology or the initial threat level makes the game harder and keeps producing new constellations. It was fun!</p>

<p>PS: Of course the hackers won both test games, if only narrowly.</p>

]]></description>
   <category domain="http://pmeerw.dyndns.org/blog">/fun</category>
   <pubDate>Sat, 02 Feb 2013 12:39 GMT</pubDate>
</item>
<item>
   <title>Getr&#228;nker&#252;ckgabeautomat (bottle return machine) rebooting</title>
   <guid isPermaLink="false">fun/getraenksautomat</guid>
   <link>http://pmeerw.dyndns.org/blog/fun/getraenksautomat.html</link>
   <description><![CDATA[
<p>
Grocery shopping becomes interesting when the bottle recycling machine resets itself and displays its IP address...
</p>

<p>
Pressing the button for a receipt spits out approximately one meter of status and log information of the machine. I'd like to know what CheatLimit, StrictAnticheat and ReceiptOnCheat means <img class="smiley" src="/images/smileys/happy.gif" alt=":-)" />
<img src="/blog/images/12120005.jpg">
&nbsp;
<img src="/blog/images/12120006.jpg">
</p>

]]></description>
   <category domain="http://pmeerw.dyndns.org/blog">/fun</category>
   <pubDate>Sat, 08 Dec 2012 17:05 GMT</pubDate>
</item>
<item>
   <title>Adventures in fast 16-bit audio sample to float conversion</title>
   <guid isPermaLink="false">programming/floatconv</guid>
   <link>http://pmeerw.dyndns.org/blog/programming/floatconv.html</link>
   <description><![CDATA[
<p>
How to efficiently convert an audio sample in 16-bit signed integer format
to a 32-bit float value on an ARM NEON CPU? And how to achieve bit-exact results?
</p>
<p>
Different projects handle the conversion in different ways:
<table border="1">
<tr>
<td></td>
<td><code>s16 -> float</code></td>
<td><code>float -> s16</code></td>
</tr>
<tr>
<td><a href="http://www.pulseaudio.org/">PulseAudio</a></td>
<td><code>flt = sample / (float) 0x7fff;</code></td>
<td><code>sample = lrintf(clip_flt(flt) * 0x7fff)</code></td>
</tr>
<tr>
<td><a href="http://libav.org">libavresample</a></td>
<td><code>flt = sample / (float) (1<<15);</code></td>
<td><code>sample = (s16) clip_s16(lrintf(flt * (1 << 15)));</code></td>
</tr>
<tr>
<td><a href="http://www.music.mcgill.ca/~gary/rtaudio/">RtAudio</a></td>
<td><code>flt = (sample + 0.5f) * (1 / 32767.5f);</code></td>
<td><code>sample = (s16) (flt * 32767.5f - 0.5f);</code></td>
</tr>
</table>

<code>clip_s16()</code> saturates its argument to the range of a 16-bit short integer (-32768..32767); <code>clip_flt()</code> clamps a float to the range -1.0..1.0.
</p>

<p>
Observations regarding PulseAudio:
<ol>
<li> <code>flt_to_s16(s16_to_flt(x)) != x</code> for <code>x == -32768</code>
<li> <code>x / (float) 0x7fff != x * (1.0f / 0x7fff)</code> for certain input values,
whereas
<code>x / (float) (1<<15) == x * (1.0f / (1<<15))</code>
always holds; the power-of-two form therefore allows replacing the division with a
multiplication by the inverse, while doing the same with 0x7fff causes a slight deviation for certain input values
<li>the CPU's saturation instructions cannot be used directly; <code>float_to_s16()</code> in PulseAudio never produces -32768
</ol>
</p>
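<p>
The first two observations can be checked by brute force over all 65536 inputs. The following is a minimal sketch; the function names are hypothetical and it assumes IEEE 754 single-precision arithmetic in the default rounding mode.
</p>

```c
#include <stdint.h>

/* Count the s16 inputs for which dividing by `scale` and multiplying by
 * its rounded inverse give different float results. */
static int count_div_mul_mismatches(float scale) {
    const float inv = 1.0f / scale;
    int mismatches = 0;
    for (int x = INT16_MIN; x <= INT16_MAX; x++) {
        /* volatile forces each result to be rounded to float precision */
        volatile float by_div = (float) x / scale;
        volatile float by_mul = (float) x * inv;
        if (by_div != by_mul)
            mismatches++;
    }
    return mismatches;
}

/* PulseAudio-style s16 -> float, then clip and scale back (before rounding).
 * Shows why -32768 does not survive the round trip. */
static float pa_roundtrip_scaled(int16_t x) {
    float flt = x / (float) 0x7fff;
    if (flt < -1.0f) flt = -1.0f;   /* -32768 lands slightly below -1.0 */
    if (flt > 1.0f) flt = 1.0f;
    return flt * 0x7fff;
}
```

<p>
Calling <code>count_div_mul_mismatches(32768.0f)</code> should report no mismatches, since scaling by a power of two is exact, while <code>count_div_mul_mismatches(32767.0f)</code> reports the deviating inputs. <code>pa_roundtrip_scaled(-32768)</code> yields -32767.0 because the intermediate float is clipped to -1.0.
</p>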

<p>
<code>lrintf()</code> rounds according to the current <a href="http://en.wikipedia.org/wiki/Rounding">rounding mode</a>, which by default is
round-to-nearest integer (with ties rounded to even). For example, this means:
<pre>
12.3 -> 12
12.5 -> 12 (!)
12.7 -> 13
13.3 -> 13
13.5 -> 14 (!)
13.7 -> 14
</pre>
So .5 values are rounded to an even value.
</p>

<p>
<h4>Efficient float_to_s16() on ARM NEON</h4>
I'm just showing the initialization and the code in the inner loop, processing 4 samples at a time.
<pre>
static void float_to_s16(const float *src, int16_t *dst) {
  __asm__ __volatile__ (
    "vdup.f32   q2, %[two23]            \n\t"
    "vdup.f32   q3, %[scale]            \n\t"
    "vdup.u32   q4, %[mask]             \n\t"
    "vdup.f32   q5, %[mone]             \n\t"

    "vld1.32    {q0}, [%[src]]!         \n\t" /* load x */
    "vmaxq.f32  q0, q0, q5              \n\t" /* clip at -1.0 */
    "vmul.f32   q0, q0, q3              \n\t" /* scale */
    "vand.u32   q1, q0, q4              \n\t" /* get sign bit */
    "vorr.u32   q1, q1, q2              \n\t" /* put sign on 2^23 */
    "vadd.f32   q0, q1, q0              \n\t" /* sgn(x)*2^23 + x ... */
    "vsub.f32   q0, q0, q1              \n\t" /* ... - sgn(x)*2^23 */
    "vcvt.s32.f32 q0, q0                \n\t" /* convert to int */
    "vqmovn.s32  d0, q0                 \n\t" /* saturate and narrow */
    "vst1.16    {d0}, [%[dst]]!         \n\t"
                                                                       
    : [dst] "+r" (dst), [src] "+r" (src) /* output operands (or input operands that get modified) */
    : [scale] "r" (32767.0f), [two23] "r" (8.3886080000e+06f), [mask] "r" (0x80000000), [mone] "r" (-1.0f) /* input operands */
    : "memory", "cc", "q0", "q1", "q2", "q3", "q4", "q5" /* clobber list */                                        
  );
}
</pre>
Observations:
<ul>
<li>the <code>vmaxq</code> instruction is needed to match PulseAudio's clipping semantics (clip -1.0f to -32767 instead of -32768); otherwise, <code>vqmovn</code> takes care of narrowing a 32-bit signed integer to 16 bits with saturation.
<li>the <code>lrintf()</code> rounding semantics (nearest integer, ties to even) are achieved with some floating-point trickery I have taken from the GNU C library implementation 
(<a href="http://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/ieee754/flt-32/s_lrintf.c;hb=HEAD">s_lrintf()</a>): the floating-point value 2^23, carrying the sign of the input, is added and then subtracted again in order to round. This can be done efficiently by extracting the sign bit from the input (<code>vand</code>) and or-ing it onto the <code>two23</code> value.
</ul>
So for <code>lrintf()</code> rounding we have: one extra <code>vand, vorr, vadd, vsub</code> -- not bad!
</p>
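<p>
The same steps can be followed in scalar C, which may be easier to read than the assembly. This is a sketch of the add-and-subtract idea only, not the NEON routine itself; the function names are hypothetical, and it assumes IEEE 754 floats and the default rounding mode.
</p>

```c
#include <stdint.h>
#include <string.h>

/* Round to the nearest integer, ties to even, by adding and subtracting
 * sgn(x) * 2^23. Floats of that magnitude have no fraction bits left, so
 * the addition itself performs the rounding. Intended for |x| < 2^23. */
static float round_magic(float x) {
    const float two23 = 8388608.0f;          /* 2^23 */
    uint32_t xbits, mbits;
    float magic;

    memcpy(&xbits, &x, sizeof xbits);
    memcpy(&mbits, &two23, sizeof mbits);
    mbits |= xbits & 0x80000000u;            /* put the sign of x on 2^23 */
    memcpy(&magic, &mbits, sizeof magic);

    volatile float t = x + magic;            /* rounding happens here */
    return t - magic;
}

/* Scalar counterpart of the inner loop: clip at -1.0, scale, round, saturate. */
static int16_t float_to_s16_scalar(float x) {
    if (x < -1.0f) x = -1.0f;                /* like the vmaxq step */
    float r = round_magic(x * 32767.0f);
    if (r > 32767.0f) r = 32767.0f;          /* like the saturating narrow */
    if (r < -32768.0f) r = -32768.0f;
    return (int16_t) r;
}
```

<p>
For example, <code>round_magic(12.5f)</code> gives 12 and <code>round_magic(13.5f)</code> gives 14, matching the <code>lrintf()</code> table above.
</p>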

<p>
<h4>Efficient s16_to_float() on ARM NEON</h4>
Now we turn to the inverse operation in a similar manner. The goal is to obtain bit-exact results for the operation <code>sample / (float) 0x7fff;</code> without actually performing the costly division. I am not saying that this makes much sense, but hey, we can <img class="smiley" src="/images/smileys/happy.gif" alt=":-)" />
</p>

<p>
First we observe that a discrepancy (i.e. <code>sample / (float) 0x7fff != sample * (1.0f / 0x7fff)</code>) occurs when the binary representation of the input value 
converted to float ends in 0x4000 (that is, <code>(q0 & 0xffff) == 0x4000</code> after the <code>vcvt</code> instruction). There are 1536 such problematic values over all possible inputs.
<pre>
static void s16_to_float(const int16_t *src, float *dst) {
  __asm__ __volatile__ (
    "vdup.f32   q1, %[invscale]         \n\t"
    "vdup.u16   q3, %[mask]             \n\t"
    "vdup.u32   q4, %[one]              \n\t"

    "vld1.16    {d0}, [%[src]]!         \n\t" /* load x */
    "vmovl.s16  q0, d0                  \n\t" /* s16 -> s32 */
    "vcvt.f32.s32 q0, q0                \n\t" /* s32 -> float */
                  
    "vceq.u16   q2, q0, q3              \n\t" /* check for defect */
    "vand.u32   q2, q2, q4              \n\t" /* prepare 1 if defect */
                                  
    "vmul.f32   q0, q0, q1              \n\t" /* multiply by invscale */
    "vadd.u32   q0, q0, q2              \n\t" /* correct if defect */
    "vst1.32    {q0}, [%[dst]]!         \n\t"

    : [dst] "+r" (dst), [src] "+r" (src) /* output operands (or input operands that get modified) */
    : [invscale] "r" (1.0f / 0x7fff), [mask] "r" (0x4000), [one] "r" (1) /* input operands */
    : "memory", "cc", "q0", "q1", "q2", "q3", "q4" /* clobber list */
  );
}
</pre>
Observations:
<ul>
<li><code>vceq</code> checks for the problem condition; it sets each 16-bit word that equals 0x4000 to 0xffff.
<li><code>vand</code> keeps just the LSB of each 32-bit word, so a 1 indicates the problem condition for that float.
<li>Finally, <code>vadd</code> adds the correction bit to the multiplication result -- making multiplication by the inverse of 0x7fff identical to division by 0x7fff.
</ul>
So for the exact conversion / division we have: one extra <code>vceq, vand, vadd</code> -- not bad!
</p>
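<p>
Here is the same correction written in scalar C, together with a brute-force comparison against the real division. Again this is just a sketch under the assumption of IEEE 754 single-precision arithmetic in the default rounding mode; the function names are hypothetical.
</p>

```c
#include <stdint.h>
#include <string.h>

/* Multiply by the rounded inverse of 0x7fff and add one to the bit
 * pattern of the result when the converted input ends in 0x4000. */
static float s16_to_float_scalar(int16_t sample) {
    const float invscale = 1.0f / 0x7fff;
    float f = (float) sample;                /* s16 -> float */
    uint32_t bits;

    memcpy(&bits, &f, sizeof bits);
    uint32_t defect = ((bits & 0xffffu) == 0x4000u) ? 1u : 0u;

    volatile float prod = f * invscale;      /* multiply by invscale */
    f = prod;
    memcpy(&bits, &f, sizeof bits);
    bits += defect;                          /* correct if defect */
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Count the inputs where the corrected product differs from the division. */
static int count_mismatches_vs_division(void) {
    int mismatches = 0;
    for (int x = INT16_MIN; x <= INT16_MAX; x++) {
        volatile float by_div = (float) x / (float) 0x7fff;
        if (s16_to_float_scalar((int16_t) x) != by_div)
            mismatches++;
    }
    return mismatches;
}
```

<p>
Adding 1 to the integer bit pattern moves the float one step away from zero for either sign, which is why the same correction serves both positive and negative samples.
</p>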


]]></description>
   <category domain="http://pmeerw.dyndns.org/blog">/programming</category>
   <pubDate>Mon, 29 Oct 2012 10:17 GMT</pubDate>
</item>
<item>
   <title>New phone: Nokia N9</title>
   <guid isPermaLink="false">n9</guid>
   <link>http://pmeerw.dyndns.org/blog/n9.html</link>
   <description><![CDATA[
<p>
The Linux phone from Nokia, the N9, is available in Austria -- for 'free'
with a 2 year service contract from <a href="http://drei.at">drei.at</a>.
</p>
<p>
So far, I'm quite happy with it and there is a no-stress root shell.
</p>

]]></description>
   <category domain="http://pmeerw.dyndns.org/blog"></category>
   <pubDate>Sat, 27 Oct 2012 13:27 GMT</pubDate>
</item>
<item>
   <title>STM32F4 stuff</title>
   <guid isPermaLink="false">stm32f4</guid>
   <link>http://pmeerw.dyndns.org/blog/stm32f4.html</link>
   <description><![CDATA[
<p>

<pre>
http://www.st.com/internet/evalboard/product/252419.jsp

https://github.com/texane/stlink

http://jeremyherbert.net/get/stm32f4_getting_started

https://www.das-labor.org/trac/browser/microcontroller/src-stm32f4xx/serialUSB

https://github.com/nabilt/STM32F4-Discovery-Firmware

Programming STM32 F2, F4 ARMs under Linux: A Tutorial from Scratch
http://www.triplespark.net/elec/pdev/arm/stm32.html
</pre>

</p>

]]></description>
   <category domain="http://pmeerw.dyndns.org/blog"></category>
   <pubDate>Wed, 10 Oct 2012 20:50 GMT</pubDate>
</item>
<item>
   <title>Building an inverter (using an NPN transistor)</title>
   <guid isPermaLink="false">not</guid>
   <link>http://pmeerw.dyndns.org/blog/not.html</link>
   <description><![CDATA[
<p>
I totally suck at electronics and anything electricity related. Current is not my friend. That's why I had to build a crappy NOT circuit
with the help of the almighty search engine.
</p>
<p>
Here is what I built with a BC547 NPN transistor that I had lying around, using a 10 kOhm resistor at the collector and a 1 kOhm resistor at the base.
</p>
<p>
<img src="/blog/images/IMG_8160.JPG">
</p>
<p>
Let's see if I can use the above circuit to invert a serial TTL signal to feed into an RS-232 UART.
</p>


]]></description>
   <category domain="http://pmeerw.dyndns.org/blog"></category>
   <pubDate>Sat, 06 Oct 2012 13:29 GMT</pubDate>
</item>
<item>
   <title>MaKey MaKey has arrived!</title>
   <guid isPermaLink="false">fun/makey</guid>
   <link>http://pmeerw.dyndns.org/blog/fun/makey.html</link>
   <description><![CDATA[
<p>
We just got two <a href="http://www.makeymakey.com/">MaKey MaKey</a>s;
watch the <a href="http://www.youtube.com/watch?v=rfQqh7iCcOU">video</a>
and see what it is!
</p>
<p>
In short, it's a USB HID mouse/keyboard device provided by an 
Arduino Leonardo (vendor:product 0x2341:0x8036). The MaKey MaKey firmware detects closed switches on digital input pins using 50 megaohm pull-ups.
</p>


]]></description>
   <category domain="http://pmeerw.dyndns.org/blog">/fun</category>
   <pubDate>Tue, 02 Oct 2012 19:32 GMT</pubDate>
</item>
</channel>
</rss>
