For a large portion of real-life utterances, the speaker's intention cannot be determined from semantics or syntax alone. Although not all sociolinguistic and pragmatic information can be digitized, phonetic features are at least indispensable for understanding spoken language. In head-final languages such as Korean in particular, sentence-final intonation is highly important for identifying the speaker's intention. This paper proposes a system that identifies the intention of an utterance given its acoustic features and text. The proposed multi-stage classification system decides whether a given utterance is a fragment, statement, question, command, or rhetorical utterance, exploiting the intonation dependency that arises from head-finality. Drawing on an intuitive understanding of Korean that guided the data annotation, we construct a network that identifies the intention of a spoken utterance and validate its utility with sample sentences. Combined with a speech recognizer, the system is expected to integrate flexibly into various language understanding modules.
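As a rough illustration of what a multi-stage decision over text and acoustic features could look like, the following sketch chains three stages: fragment detection, text-only classification, and an acoustic fallback for utterances whose intention depends on intonation. The stage order, the names `Utterance`, `identify_intention`, and the `toy_*` functions, and the rules inside the toy stages are all assumptions made for illustration; they are placeholders, not the network or features described in this paper.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional, Sequence


class Intention(Enum):
    FRAGMENT = "fragment"
    STATEMENT = "statement"
    QUESTION = "question"
    COMMAND = "command"
    RHETORICAL = "rhetorical"


@dataclass
class Utterance:
    text: str
    acoustic: Sequence[float]  # hypothetical stand-in, e.g. a pitch contour


def identify_intention(
    utt: Utterance,
    is_fragment: Callable[[str], bool],
    classify_text: Callable[[str], Optional[Intention]],
    classify_with_audio: Callable[[Utterance], Intention],
) -> Intention:
    """Assumed three-stage cascade; each stage is passed in as a callable."""
    # Stage 1: filter out fragments.
    if is_fragment(utt.text):
        return Intention.FRAGMENT
    # Stage 2: text-only decision; None means "intonation-dependent".
    label = classify_text(utt.text)
    if label is not None:
        return label
    # Stage 3: resolve the remaining ambiguity with acoustic features.
    return classify_with_audio(utt)


# Toy placeholder stages (illustrative rules only, not trained models).
def toy_is_fragment(text: str) -> bool:
    return len(text.split()) < 2


def toy_classify_text(text: str) -> Optional[Intention]:
    return Intention.QUESTION if text.endswith("?") else None


def toy_classify_with_audio(utt: Utterance) -> Intention:
    pitch = utt.acoustic
    rising = len(pitch) >= 2 and pitch[-1] > pitch[0]
    return Intention.QUESTION if rising else Intention.STATEMENT


if __name__ == "__main__":
    rising = Utterance("you went home", [100.0, 120.0, 180.0])
    print(identify_intention(rising, toy_is_fragment,
                             toy_classify_text, toy_classify_with_audio))
```

Passing the stages in as callables keeps the cascade independent of any particular model, so a trained text or audio classifier could replace a toy stage without changing the control flow.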